I am doing some simple cross tabulations using Proc Freq, but I'm noticing that the output SAS gives me doesn't contain any frequency counts; I'm only getting percents.
Here is some example code that I ran in SAS (I am using SAS 9.4):
data test;
input year 1-5 group $6;
cards;
2018 A
2018 A
2018 B
2018 B
2019 A
2019 A
2019 A
2019 B
;
run;
proc freq data = test;
table year * group / norow nopercent;
run;
I'm expecting a table that has the frequency counts with the column percentage below, but instead, this is what SAS is giving me:
Does anyone know how I can get the frequency values to be shown?
I ran your code and got this. I reckon there is something you are not telling us.
Thank you all for your help. I found the issue: there was a problem with the cross-tab frequency template that came with SAS. I was able to restore it with the following code:
proc template;
delete base.freq.crosstabfreqs;
run;
Thank you all for your help!
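For anyone hitting the same symptom, here is a minimal sketch for checking whether a customized copy of the template is shadowing the shipped one before deleting anything. It assumes the modified copy lives in the default user item store, sasuser.templat; your ODS path may differ.

```sas
* Show the template search path (item stores are searched in the order listed);
ods path show;

* List any Base.Freq templates stored in the user item store.
  sasuser.templat is an assumption here; substitute the store shown by ODS PATH SHOW;
proc template;
  list base.freq / store=sasuser.templat;
run;
```

If Base.Freq.CrossTabFreqs shows up in that listing, the DELETE statement above removes that copy, and SAS falls back to the version supplied in sashelp.tmplmst.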
#_null_, your image is NOT the output I get when running the question's code.
The Frequency and Col Pct are NOT in row header cells, and instead are shown in a box offset to the left from the table.
Related
I have a date time variable 'chg_date_of_svc' and would like to make this variable a month_year variable. To do this, I simply wrote the following code:
data combined1;
set combined;
MONTH_YEAR=chg_date_of_svc;
format MONTH_YEAR monyy7.;
run;
I would then like to use the month_year variable in a proc freq statement; however, the month_years do not appear in chronological order when using the following code. For example, January 2019 appears before December 2018 in the tables the proc freq statement produces.
This may not be the easiest solution, but I suspect I have to relabel the specific year_months so they appear in the correct chronological order?
proc freq data = combined1 order=data;
table EM_Charge*MONTH_YEAR;
run;
Thank you for the help.
You requested that it list the columns in the order in which they first appear in the input dataset. If you want them in chronological order, remove the ORDER=DATA option. If you must use ORDER=DATA, sort the data first.
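Both options could look roughly like this. It is a sketch that reuses the dataset and variable names from the question and assumes MONTH_YEAR holds a SAS date value (not a datetime), so that the MONYY7. format applies as written.

```sas
* Option 1: drop ORDER=DATA so levels are ordered by their internal (unformatted) values;
proc freq data=combined1;
  table EM_Charge*MONTH_YEAR;
run;

* Option 2: keep ORDER=DATA, but sort first so the months are first encountered chronologically;
proc sort data=combined1;
  by MONTH_YEAR;
run;

proc freq data=combined1 order=data;
  table EM_Charge*MONTH_YEAR;
run;
```

Note that with ORDER=DATA the EM_Charge rows are also listed in order of first appearance after the sort, which may or may not be what you want.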
I am working with panel data that looks something like this:
I am going to perform a t-test in SAS 9.4 to find out if there is a significant change in var1 from 2014 to 2016, and I am assuming that I have to use a paired t-test, since I have an observation in both 2014 and 2016 for each individual (ID).
My question is, can this be done in SAS when I am using panel data like the one I have shown? Or do I need to create a wide dataset with one variable containing the data from 2014 and one variable containing the data from 2016? I know that I have to do that in STATA, but maybe I don't have to change my entire dataset to do this in SAS?
You will have to transpose your data to do a paired t-test. You can use PROC TRANSPOSE for that, though.
*sort for transpose;
proc sort data=have; by id year; run;
*reformat from long to wide;
proc transpose data=have out=want prefix=Year_;
by ID;
ID Year;
Var Var1;
run;
*Paired T-Test;
proc ttest data=want;
paired Year_2014*Year_2016;
run;
PS. Please include your data as text not an image in the future. We cannot write code off an image and I'm not typing out your data, so at present this is untested but should work.
Hi I have a time series data table from October 2013 to October 2016. I would like to plot the time series from October 2013 to November 2014, October 2014 to November 2015, and October 2015 to November 2016 on top of each other on the same graph to analyze any seasonal trends.
My idea is to create separate data tables with each subsegment, but is there an easier way to do this in SAS?
This is an example of the data table I want to plot the seasonality of.
I think the workflow here is to add a group variable that indicates, say, year, and that has the same value for all rows you want plotted as one line.
Then you use the group statement in whatever plot type you want. Something like:
data stocks_years;
set sashelp.stocks;
date_year = intck('YEAR','01AUG1986'd,date,'c')+1986;
date_month= month(date);
run;
proc sgplot data=stocks_years;
vline date_month/response=close group=date_year stat=mean;
run;
This is an example of doing that to see the average close per month of the three stocks in the SASHELP.STOCKS dataset. It is a terrible plot of course but it should give you some idea of what it would look like. Each of those differently colored lines is from a different year (aug->jul being defined as a year, with the number being the year number of aug).
The lead off provided by Joe gave me everything I needed. Here is the completed code for anyone else's reference.
%macro Plot_Seasonal_Worse_TP(tbl_name, tp, cutoff_date);
/*tp = transition probability */
proc sql;
create table &tbl_name._trim as
SELECT *
FROM &tbl_name
WHERE asofdt > &cutoff_date;
quit;
data &tbl_name._trim;
set &tbl_name._trim;
date_year = intck('YEAR','01NOV2013'd,asofdt,'c')+2014;
date_month= MOD(month(asofdt)+2, 12); /* move november and december of previous year to front of time series */
run;
proc sgplot data=&tbl_name._trim;
vline date_month/response=&tp group=date_year;
title "&tbl_name (&tp)";
run;
%mend Plot_Seasonal_Worse_TP;
Output looks like this as well.
First question, is it possible to produce a boxplot using proc gchart in SAS?
If it is possible, please give me a brief idea.
Or else, on the topic of using proc boxplot.
Suppose I have a dataset that has three variables ID score year;
something like,
data aaa;
input id score year;
datalines;
1 50 2008
1 40 2007
2 30 2008
2 20 2007
;
run;
I want to produce a boxplot showing for each ID in each year. (So in this case, 4 boxplots in a single plot)
How can I achieve this?
I have tried using
proc boxplot data=aaa;
plot score*ID;
by year;
run;
However, this is not working because, as we can see, the data are not sorted by year.
Is there a way to get around this?
You need to sort your input dataset first. Run this
proc sort data = aaa;
by year;
run;
and then your proc boxplot should work as written.
This is quite easy to do with sgplot, which is part of the newer ODS Graphics suite which is available in base SAS.
proc sgplot data=sashelp.cars;
vbox mpg_city/category=type group=origin grouporder=ascending;
run;
You would use category=id and group=year in your example data. You get one tick on the x axis for each category, and within each category you get a separate box for each group, clustered together.
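Applied to the aaa dataset from the question, that would be something like the following sketch (untested against your real data):

```sas
* One x-axis tick per id, with one box per year clustered within each id;
proc sgplot data=aaa;
  vbox score / category=id group=year grouporder=ascending;
run;
```

With the four-row sample data each id/year combination has only one observation, so every box collapses to a single line; real data with several scores per id and year will show proper boxes.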
Let's say I have 50 years of data for each day and month. I also have a column which lists the max rainfall for each day of that dataset. I want to be able to compute the average monthly rainfall and standard deviation for each of those 50 years. How would I accomplish this task? I've considered using PROC MEANS:
PROC MEANS DATA = WORK.rainfall;
BY DATE;
VAR AVG(max_rainfall);
RUN;
but I'm unsure how to tell SAS that I want to use the MM of the MMDDYY format to indicate where to start and stop calculating those averages for each month. I also do not know how to tell SAS within this PROC MEANS step to format the data correctly, using MMDDYY10. This is why my code fails.
Update: I've also tried using this statement,
proc sql;
create table new as
select date,count(max_rainfall) as rainfall
from WORK.rainfall
group by date;
create table average as
select year(date) as year,month(date) as month,avg(rainfall) as avg
from new
group by year,month;
quit;
but that doesn't solve the problem either, unfortunately. It gives me the wrong values, although it does create a table. Where in my code could I have gone wrong? Am I correctly telling SAS to add up all the rainfall values in each month and then divide by the number of days in that month? Here's a snippet of my table.
You can use a format to group the dates for you. But you should use a CLASS statement instead of a BY statement. Here is an example using the dataset SASHELP.STOCKS.
proc means data=sashelp.stocks nway;
where date between '01JAN2005'd and '31DEC2005'd ;
class date ;
format date yymon. ;
var close ;
run;
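Adapted to the rainfall data, the same idea might look like the sketch below. It assumes DATE in WORK.rainfall is a numeric SAS date and reuses the max_rainfall variable from the question; the output dataset name monthly and the statistic names avg_rainfall and sd_rainfall are made up for illustration.

```sas
* Group daily values by year and month using a format, then get mean and standard deviation;
proc means data=work.rainfall nway mean std;
  class date;
  format date yymon7.;          /* e.g. 2005JAN: one class level per year-month */
  var max_rainfall;
  output out=monthly mean=avg_rainfall std=sd_rainfall;   /* hypothetical output names */
run;
```

As for the PROC SQL attempt: COUNT(max_rainfall) returns the number of non-missing values per date rather than the rainfall itself, so the second query ends up averaging counts, which would explain the wrong values.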