Overlay the average trend on group by trends using Proc sgplot - sas

I want to create a line graph that includes the overall trend of a disease rate and the specific trends for males and females. I use the following code for to create the group by trends. How to add he average trend to this line graph. Thanks for your help.
proc sgplot data=have ;
vline year/response=disease_rate group=sex stat=mean datalabel=disease_rate ;
yaxis values=(0,1) label="Percentage";
run;

Here's an example of summarizing it and then displaying it on the graph. There are more than one way to do this though, this is just one.
data have;
set sashelp.heart(in=a);
year=round(2021-ageAtStart, 10);
disease_rate= status="Dead";
run;
proc means data=have mean noprint;
class sex year;
types sex sex*year;
var disease_rate;
output out=summary_stats mean=average_value;
run;
proc sort data=summary_stats;
by sex year;
run;
data graph_data;
merge summary_stats(where=(_type_=2) rename=average_value=mean_sex_year)
summary_stats(where=(_type_=3) rename=average_value = mean_sex);
by sex;
format mean_sex: percent12.1;
run;
proc sgplot data=graph_data ;
*where year > 1990;
vline year/response=mean_sex_year group=sex stat=mean datalabel=mean_sex_year ;
vline year/response=mean_sex group=sex stat=mean datalabel=mean_sex ;
run;

Use series instead of vline so that you can overlay a regression on top of it to get an average trend line. For example:
proc sql;
create table have as
select date
, region
, sum(sale) as sale
from sashelp.pricedata
group by region, date
order by region, date
;
quit;
proc sgplot data=have;
series x=date y=sale / group=region;
reg x=date y=sale / group=region;
xaxis fitpolicy=rotatethin;
run;

Related

SAS iterpolation : alternative to proc expand

From annual data :
I would like to create the the data per day but I can't use the proc expand because the SAS ETS is not available.
Thank you for your suggestions.
Something like this is a basic approach perhaps:
create a list of dates for interpolation
merge have data (shown above, not included in code below)
Plot to see if linear pattern, (looks somewhat exponential/curved)
run linear regression, saving predicted values
plot interpolated values against actual values
data years;
do date='30Jun2017'd to '30Jun2022'd;
output;
end;
run;
data have;
merge years have;
by date;
format date date9.;
run;
proc sgplot data=have;
series x=date y=px_last;
run;
proc reg data=have plots;
model px_last = date;
output out=pred p=predicted_value;
run;
proc sgplot data=pred;
series x=date y=predicted_Value;
scatter x=date y=px_last;
run;

Formating proc freq output table in SAS

I have the following data
DATA HAVE;
input year dz area;
cards;
2000 1 08
2000 1 06
2000 1 06
;
run;
proc freq data=have;
table area*dz / norow nocol;
run;
I get the following output
I would like to format it to put frequency in one column and percent in another column and I don't want the total column. Is there a way to do it?
Thank you!
Try adding the LIST option to get a different layout:
proc freq data=have;
table area*dz / norow nocol LIST;
run;
Pipe it to a data set and format as desired:
proc freq data=have;
table area*dz / norow nocol LIST out=want;
run;
proc print data=want;run;
Use PROC TABULATE instead (not shown), which allows you more control over your layout and formats.

Average number of rows per variable in SAS

I have the following dataset :
data test;
input business_ID $;
datalines;
'busi1'
'busi1'
'busi1'
'busi2'
'busi3'
'busi3'
;
run;
proc freq data = test ;
table business_ID;
run;
I would like the average nummber of lines per business, that is count the total number of observations and divide it by the number of distinct businesses.
In my example : 6 observations, 3 businesses -> 6/2=3 lines per business.
I was thinking about using a proc freq or a proc mean step but so far I got only the number of lines (~freq) per business and do not know how to get to my goal.
Any idea?
You could use PROC FREQ to get the counts and then run PROC MEANS on the output.
proc freq data=test ;
tables business_id / noprint out=counts ;
run;
proc means data=counts;
var count;
run;
Or you could count them directly with PROC SQL code.
proc sql ;
select count(*)/count(distinct business_id) as mean_count
from test
;
quit;

How do I put conditions around Proc Freq statements in SAS?

I have the following statement
Proc Freq data =test;
tables gender;
run;
I want this to generate an output based on a condition applied to the gender variable. For example - if count of gender greater than 2 then output.
How can I do this in SAS?
Thanks
If you mean an output dataset, you can put a where clause directly in the output dataset options.
Proc Freq data =sashelp.class;
tables sex/out=sex_freq(where=(count>9));
run;
I'm not aware of how you can accomplish this only using proc freq but you can redirect the output to a data set and then print the results.
proc freq data=test;
tables gender / noprint out=tmp;
run;
proc print data=tmp;
where count > 2;
run;
Alternatively you could use proc summary, but this still requires two steps.
proc summary data=test nway;
class gender;
output out=tmp(where=(_freq_ > 2));
run;
proc print data=tmp;
run;

How to construct histograms with unequal class widths in SAS?

I am trying to create histograms in sas with the help of proc univariate in sas. But it gives me histograms with equal class widths. Suppose i want to have a histogram with first class interval from 1 to 10 and second class interval from 10 to 100.
I tried using-
proc univariate data=sasdata1.dataone;
var sum;
histogram sum/ midpoints=0 to 10 by 10 10 to 100 by 90 ;run;
But this does not work. What is the correct way of doing this?
You can't do it with UNIVARIATE as far as I know, but any of the SGPLOT/GPLOT/etc. procedures will work; just bin your data into a categorical variable and VBAR that variable.
If you're okay with frequencies (not percents), this would work:
data test;
set sashelp.class;
do _t = 1 to floor(ranuni(7)*20);
age=age+floor(ranuni(7)*10);
output;
end;
run;
proc format;
value agerange
low-12 = "Pre-Teen"
13-14 = "Early Teen"
15-18 = "Teen"
19-21 = "Young Adult"
22-high = "Adult";
quit;
ods graphics on;
ods preferences;
proc sgplot data=test;
format age agerange.;
vbar age;
run;
I believe if you need percents, you'd want to PROC FREQ or TABULATE your data first and then SGPLOT (or GPLOT) the results.
I did find a macro that can be used to create histograms with unequal endpoints.
The code can be found in the NESUG 2008 proceedings