SAS proc Freq & gchart display additional value's frequency/ bars - sas

This might be a weird question. I have a data set contains data like agree, neutral, disagree...for many questions. There is not so many observations so for some question, one or more options has frequency of 0, say neutral. When I run proc freq, since neutral shows up in that variable, the table does not contain a row for neutral. I end up with tables with different number of rows. I would like to know if there is a option to show these 0 frequency rows. I will also need to run proc gchart for the same data set, and I will run into the same problem for having different number of bars. Please help me on this. Thank you!

This depends on how exactly you are running your PROC FREQ. It has the sparse option, which tells it to create a value for every logical cell on the table when creating an output dataset; normally, while you would have a cell with a missing value (or zero) in a crosstab, if that is output to a dataset (which is vertical, ie each combination of x and y axis value are placed in one row) those rows are left off. Sparse makes sure that doesn't happen; and in a larger (n-dimensional) crosstab, it creates rows for every possible combination of every variable, even ones that don't occur in the data.
However, if you're just doing
proc freq data=mydata;
tables myvar;
run;
That won't help you, as SAS doesn't really have anything to go on to figure out what should be there.
For that, you have to use a class variable procedure. Proc Tabulate is one of such procedures, and is similar to Proc Freq in its syntax (sort of). You need to either use CLASSDATA on the proc statement, or PRINTMISS on the table statement. In the former case, you do not need to use a format, I don't believe. In the latter case (PRINTMISS), you need to create a format for your variable (if you don't already have one) that contains all levels of the data that you want to display (even if it's just an identity format, e.g. formatting character strings to identical character strings), and specify PRELOADFMT on the proc statement. See this man page for more details.

Related

Dealing with missing values in SAS?

I am handling a SAS dataset with little observations and I need to put on relation a variable with its lag.
By doing this, I lose a record resulting in a missing values.
Do some of you know how SAS Base handles such items in the procedures as PROC REG or PROC CORR?
Thanks you all in advance.
It depends a bit on what exactly you're doing. Based on your references to PROC REG and CORR - it excludes the missing values - listwise. This means if any value is missing in a row that is being used or referenced in the PROC it will be excluded.
http://documentation.sas.com/?docsetId=lrcon&docsetTarget=p175x77t7k6kggn1io94yedqagl3.htm&docsetVersion=9.4&locale=en
PROC CORR has several options that allow you to specify how it handles the missing values. The NOMISS option tells SAS to use only complete cases.
https://blogs.sas.com/content/iml/2012/01/13/missing-values-pairwise-corr.html

Missing values in a FREQ (SAS)

I'm going to ask this with an example...
Suppose i have a data set where each observation represents a person. Two of the variables are AGE and HASADOG (and say this has values 1 for yes and 2 for no.) Is there a way to run a PROC FREQ (by AGE*HASADOG) that forces SAS to include in the report a line for instances where the count is zero?
By this I mean: if there is a particular value for AGE such that no observation with this AGE value has a 1 in the HASADOG variable, the report will still include a row for this combination (with a row percent of 0.)
Is this possible?
The SPARSE option in PROC FREQ is likely all you need.
proc freq data=sashelp.class;
table sex*age / sparse list;
run;
If the value is nowhere in your data set at all, then there's no way for SAS to know it exists. In this case you'd need a more complex solution, basically a way to tell SAS all values you would be using ahead of time. This can be done via a PRELOADFMT or CLASSDATA option on several procs. There are asked an answered questions on this topic here on SO, so I won't provide a solution for this option, which seems beyond the scope of your question.

SAS Proc Tabulate output dataset variable names

I am using proc tabulate to create an output dataset with statistics (n mean std min max p25 p75 median) for a variable with a long name (close to the 32 character maximum). The output dataset will add _n, _std, etc to our variable name, but the median variable is just named "Median" because the variable name with "_median" added to the end, the resulting variable name would be >32 characters.
Is there a way to specify the name of the variables in the output dataset from within the proc tabulate step? I am looping through 1000s of variable for this procedure, so it's not feasible to rename each variable in a data step. Also, it must be proc tabulate and not proc freq because we need to output a row for every possible value of each variable, not just those values that exist in the data.
proc tabulate data=DATA out=OUT ;
var VERY_LONG_VARIABLE_NAME;
table VERY_LONG_VARIABLE_NAME *(n mean std min max p25 p75 median)/printmiss;
run;
Unfortunately I don't know of a way to override the tabulate names. Even transposing the tabulate doesn't fix that - you still get the same result, sadly.
My suggestion is to use a different proc. Almost all of the procs you might use have a way to get what you want - the PRINTMISS equivalent; for example, PROC FREQ has the SPARSE option which does basically the same thing (despite its odd name), and PROC SUMMARY or PROC MEANS might be even better (with COMPLETETYPES on the class statement), just depending on your data.
Alternately, you could reshape your data, or reshape your process. For example, if you're really looping through thousands of variables, that's horribly inefficient; better would be to reshape to variable|value structure (vertical) and then do one proc tabulate; that would fix your issue right there (as it would make 'varname' be a CLASS or BY variable itself not a contributor to the output variable name) and make your process faster.
You could also add a VIEW step before the tabulate that performs the rename for you; that would cost very little even in a macro loop.
Either way, supply some sample data and an example of the total process you're doing and likely you can get a better answer.

Interleaving output from two different procedures with a by value

I have a large SAS dataset and I would like to make a series of tables and charts using by value processing. I am outputing these to a PDF.
Is there any way to get SAS to alternate between the table and the chart as it goes through the data? Right now, I have to print all of the tables first and then print the charts. If it were just 4 tables/charts, then I would be ok writing
Here is a simple example:
data sample;
input byval $ item $ amount;
datalines;
A X 15
A Y 16
A Z 12
B X 25
B Y 10
B Z 18
;
run;
symbol1 i=j;
proc print data=sample;
by byval;
var item amount;
run;
proc gplot uniform data=sample;
by byval;
plot amount*item;
run;
This prints 2 tables, followed by 2 charts.
I would like the Chart for "A" to come after the table for "A" so that the reader can flip through the pdf and always see the associated charts and tables together.
I could write separate procs for each one, but then the gplot won't have a uniform axis (and it gets messy if I have 100 different groups instead of 2).
I thought about pumping them into greplay but then you can't use titles with "#BYVAL1".
Is there any easy way to do this?
I've never used it, but it may be worth checking out ODS DOCUMENT. This allows you to store the output of all your procedures and then reference specific items from them using PROC DOCUMENT.
Below is a link to the SAS website with useful information about this, in particular the paper by Cynthia Zender for the SAS Global Forum 2009.
http://support.sas.com/rnd/base/ods/odsdocument/index.html
Cynthia also regularly contributes to the SAS Support Communities website (https://communities.sas.com/community/support-communities), so it may be worth asking on there if you are still stuck.
Good luck
I don't know of any way to do what you ask directly. GREPLAY is probably the closest you'll come; the primary problem is that SAS processes the PROCs linearly, first processing the entire PROC PRINT, then the entire PROC GPLOT. GREPLAY would allow you to redisplay the output, but if that doesn't work for your needs due to the #BYVAL issue, I'm not sure there's a better solution. Perhaps you can modify the title afterwards (not sure if GREPLAY allows this)?
You could try using ODS LAYOUT, but I don't think that would be any better. The one way it could be better is if you can work out having two columns on a 'page', one column being the PROC PRINT outputs, one the PROC GPLOT, and then print the columns one page than the other. I don't think this is possible, but it might be worth exploring.
You might also try setting up a macro to do each BYVAL separately, defining the axis in a uniform manner manually (ie, defining it based on your own calculation of the correct axis parameters, as an argument to the macro). That is probably the easiest solution that might still allow #BYVAL to work properly.
You might also try browsing about Richard DeVenezia's site (http://www.devenezia.com/downloads/sas/samples/ ) which has a lot of examples of SAS/GRAPH solutions. He also posts on SAS-L (sasl#listserv.uga.edu) sometimes, not sure if I've seen him on StackOverflow. He's probably the person most likely to be able to answer the question that I know of.

Compare in SAS proc

First off, I know pretty much nothing about SAS and I am not a programmer but an accountant, but here it goes:
I am trying to compare two data sets to identify differences between them, so I am using the 'proc compare' command as follows:
proc compare data=table1 compare=table2
criterion=.01;
run;
This works fine, but it compares line by line and in order, so if table2 is missing a row half way through, then all entries after that row will be returned as not equal.
How do I ask the comparison to be made based on a variable so that the proc compare finds the value associated with variable X in table 1, and then makes sure that the same variable X in table 2 has the same value?
The ID statement in PROC COMPARE is used to match rows. This code may work for you:
proc compare data=table1 compare=table2 criterion=.01;
id X;
run;
You may need to use PROC SORT to sort the data by X before doing the PROC COMPARE. Refer to the PROC COMPARE documentation for details on the ID statement to determine if you should sort or not.
Here is a link to the PROC COMPARE documentation:
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/a000057814.htm