A very simple question...
I'm trying to plot some data with SGPLOT. The x-axis should show t0, t6, t12, t18, but SAS orders the values as t0, t12, t18, t6.
I tried setting the desired order manually, and I also tried sorting the data by the column containing the t* labels in the desired order, but neither had any effect. Can anyone help me solve this issue?
Thank you in advance.
Use the values= option of the xaxis statement to specify your order. For example:
data have;
input t$ value;
datalines;
t0 1
t12 10
t18 30
t6 4
;
run;
proc sgplot data=have;
vbar t / response=value;
xaxis values=('t0' 't6' 't12' 't18');
run;
If you have a lot of these, you can build the ordered list in a macro variable: extract the numeric part of each value of t, convert it to a numeric variable, sort by that variable, and then read the quoted values of t into the macro variable.
data set_order;
set have;
/* Get only the numeric part */
t_order = input(compress(t,,'A'), 8.);
run;
proc sort data=set_order;
by t_order;
run;
proc sql noprint;
select quote(t)
into :t separated by ' '
from set_order;
quit;
proc sgplot data=have;
vbar t / response=value;
xaxis values=(&t.);
run;
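As a possible alternative to the macro variable, the sorted data set can be plotted directly. This is a minimal sketch, assuming set_order has already been sorted by t_order as in the code above. It relies on the discreteorder=data option of the xaxis statement, which places categories in the order they appear in the data rather than in alphabetical order.

```sas
/* Assumes SET_ORDER exists and is sorted by T_ORDER (see the steps above) */
proc sgplot data=set_order;
  vbar t / response=value;
  /* Use data order on the discrete axis instead of the default sorted order */
  xaxis discreteorder=data;
run;
```

This would also explain why sorting alone appeared to do nothing: without discreteorder=data, a discrete axis is ordered by the category values themselves, not by the order of the rows.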
From annual data:
I would like to create daily data, but I can't use PROC EXPAND because SAS/ETS is not available.
Thank you for your suggestions.
Something like this is a basic approach, perhaps:
Create a list of dates for interpolation.
Merge it with the have data (shown above, not included in the code below).
Plot the data to see whether the pattern is linear (it looks somewhat exponential/curved).
Run a linear regression, saving the predicted values.
Plot the interpolated values against the actual values.
data years;
do date='30Jun2017'd to '30Jun2022'd;
output;
end;
run;
data have;
merge years have;
by date;
format date date9.;
run;
proc sgplot data=have;
series x=date y=px_last;
run;
proc reg data=have plots;
model px_last = date;
output out=pred p=predicted_value;
run;
proc sgplot data=pred;
series x=date y=predicted_Value;
scatter x=date y=px_last;
run;
I want to create a line graph that includes the overall trend of a disease rate and the specific trends for males and females. I use the following code to create the trends by group. How can I add the average trend to this line graph? Thanks for your help.
proc sgplot data=have ;
vline year/response=disease_rate group=sex stat=mean datalabel=disease_rate ;
yaxis values=(0,1) label="Percentage";
run;
Here's an example of summarizing the data and then displaying it on the graph. There is more than one way to do this; this is just one.
data have;
set sashelp.heart(in=a);
year=round(2021-ageAtStart, 10);
disease_rate= status="Dead";
run;
proc means data=have mean noprint;
class sex year;
types sex sex*year;
var disease_rate;
output out=summary_stats mean=average_value;
run;
proc sort data=summary_stats;
by sex year;
run;
data graph_data;
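/* With CLASS sex year, _type_=2 is the per-sex mean and _type_=3 is the per-sex, per-year mean */
merge summary_stats(where=(_type_=2) rename=(average_value=mean_sex))
summary_stats(where=(_type_=3) rename=(average_value=mean_sex_year));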
by sex;
format mean_sex: percent12.1;
run;
proc sgplot data=graph_data ;
*where year > 1990;
vline year/response=mean_sex_year group=sex stat=mean datalabel=mean_sex_year ;
vline year/response=mean_sex group=sex stat=mean datalabel=mean_sex ;
run;
Use series instead of vline so that you can overlay a regression on top of it to get an average trend line. For example:
proc sql;
create table have as
select date
, region
, sum(sale) as sale
from sashelp.pricedata
group by region, date
order by region, date
;
quit;
proc sgplot data=have;
series x=date y=sale / group=region;
reg x=date y=sale / group=region;
xaxis fitpolicy=rotatethin;
run;
I have the following statement
Proc Freq data =test;
tables gender;
run;
I want this to generate output based on a condition applied to the gender variable. For example, a gender value should be output only if its count is greater than 2.
How can I do this in SAS?
Thanks
If you mean an output dataset, you can put a where clause directly in the output dataset options.
Proc Freq data =sashelp.class;
tables sex/out=sex_freq(where=(count>9));
run;
I'm not aware of a way to accomplish this using PROC FREQ alone, but you can redirect the output to a data set and then print the results.
proc freq data=test;
tables gender / noprint out=tmp;
run;
proc print data=tmp;
where count > 2;
run;
Alternatively you could use proc summary, but this still requires two steps.
proc summary data=test nway;
class gender;
output out=tmp(where=(_freq_ > 2));
run;
proc print data=tmp;
run;
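If a single step matters, one further option is PROC SQL, which is a different technique from the PROC FREQ and PROC SUMMARY approaches above. The sketch below assumes the same test data set and gender variable from the question. It filters the grouped counts with a HAVING clause and prints the result directly.

```sas
/* Assumes a data set TEST with a variable GENDER, as in the question */
proc sql;
  select gender,
         count(*) as count
  from test
  group by gender
  /* Keep only the gender values that occur more than twice */
  having count(*) > 2;
quit;
```

Adding "create table tmp as" in front of the select would store the filtered counts in a data set instead of printing them.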
I have the following code
data work.customBins;
retain fmtname 'bins' type 'n';
do binStart=-2.5 to 2.45 by 0.05;
binEnd=binStart+0.05;
difference=cat(binStart," to ",binEnd);
output;
end;
run;
proc format library=work cntlin=work.customBins; run;
proc freq data=work.myData;
table variable /missing;
format variable bins.;
run;
This code works properly. My only issue is that if I have bins that don't contain any values, for example -1.45 to -1.40, PROC FREQ disregards them. I want the cumulative frequency of the previous bin to be displayed for the bins that have no values. For example:
-1.50 to -1.45: cumulative freq = 2%
-1.45 to -1.40 has no values, but the cumulative freq for this bin should also be 2%
I have also tried doing this
data work.combined;
set work.myData (in=a) work.customBins (in=b);
if a then cont=1;
if b then cont=0;
run;
proc freq data=work.combined;
table variable /missing;
format variable bins.;
weight cont/zeros;
run;
But this does not work either.
myData contains just a single variable called variable, which holds decimal numbers in the range -2.45 to 2.45.
Here is a working variant:
data work.customBins;
do binStart=-2.5 to 2.45 by 0.05;
binEnd=binStart+0.05;
difference=cat(binStart," to ",binEnd);
output;
end;
run;
proc sql;
create table want as
select difference, count(variable) as count
from customBins left join mydata
on binStart < variable <= binEnd
group by difference
order by binStart;
quit;
proc freq data=want order=data;
tables difference;
weight count / zeros;
run;
Regarding your first variant: are you sure that your PROC FORMAT works as expected? The data set used in the CNTLIN= option should have variables named START, END and LABEL, not arbitrarily named ones. In any case, it wouldn't work, because PROC FREQ only reports values that actually occur in the mydata data set, no matter how many other labels you define in your format.
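To illustrate the point about variable names, here is a minimal sketch of the control data set with the names CNTLIN= looks for. It keeps the asker's loop and only adds a RENAME statement; the assumption is that adjacent bins sharing an endpoint are accepted the same way they are in a VALUE statement. It does not solve the empty-bin problem, which still needs the join shown above.

```sas
data work.customBins;
  retain fmtname 'bins' type 'n';
  do binStart = -2.5 to 2.45 by 0.05;
    binEnd = binStart + 0.05;
    difference = cat(binStart, " to ", binEnd);
    output;
  end;
  /* CNTLIN= expects START, END and LABEL, so rename on output */
  rename binStart=start binEnd=end difference=label;
run;

proc format library=work cntlin=work.customBins;
run;
```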
I am trying to create histograms in SAS with PROC UNIVARIATE, but it gives me histograms with equal class widths. Suppose I want a histogram whose first class interval runs from 1 to 10 and whose second class interval runs from 10 to 100.
I tried using:
proc univariate data=sasdata1.dataone;
var sum;
histogram sum/ midpoints=0 to 10 by 10 10 to 100 by 90 ;run;
But this does not work. What is the correct way of doing this?
You can't do it with UNIVARIATE as far as I know, but any of the SGPLOT/GPLOT/etc. procedures will work; just bin your data into a categorical variable and VBAR that variable.
If you're okay with frequencies (not percents), this would work:
data test;
set sashelp.class;
do _t = 1 to floor(ranuni(7)*20);
age=age+floor(ranuni(7)*10);
output;
end;
run;
proc format;
value agerange
low-12 = "Pre-Teen"
13-14 = "Early Teen"
15-18 = "Teen"
19-21 = "Young Adult"
22-high = "Adult";
run;
ods graphics on;
ods preferences;
proc sgplot data=test;
format age agerange.;
vbar age;
run;
I believe if you need percents, you'd want to PROC FREQ or TABULATE your data first and then SGPLOT (or GPLOT) the results.
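A minimal sketch of that two-step approach, assuming the test data set and the agerange. format defined above: PROC FREQ groups by the formatted value and writes COUNT and PERCENT to an output data set, and SGPLOT then draws one bar per bin using PERCENT as the response. The name age_pct is only a placeholder.

```sas
/* Assumes TEST and the AGERANGE. format from the code above */
proc freq data=test noprint;
  format age agerange.;
  /* OUT= gets one row per formatted bin, with COUNT and PERCENT */
  tables age / out=age_pct;
run;

proc sgplot data=age_pct;
  format age agerange.;
  vbar age / response=percent;
  yaxis label="Percent";
run;
```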
I did find a macro that can be used to create histograms with unequal endpoints.
The code can be found in the NESUG 2008 proceedings