Performing PCA on numeric values in SAS

I am performing PCA on the numeric columns in my dataset. These columns are all in the same range, so do I still need to standardize them (using PROC STANDARD) before performing PCA?

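You do not need to standardize them before running proc princomp. By default it analyzes the correlation matrix, which is equivalent to standardizing the variables first. Add the std option if you also want the principal component scores in the output data set standardized to unit variance.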
proc princomp data=mydata out=scores std;
var var1 var2 var3;
run;
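If you would rather keep the variables on their original scale, which can be reasonable when they really do share a range and units, the COV option makes proc princomp analyze the covariance matrix instead of the default correlation matrix. A minimal sketch, reusing the data set and variable names from the code above (scores_cov is just an illustrative output name):

```sas
/* COV: use the covariance matrix, i.e. no implicit standardization */
proc princomp data=mydata out=scores_cov cov;
var var1 var2 var3;
run;
```

With COV, variables with larger variance carry more weight in the components, so this is only sensible if the scales are genuinely comparable.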

Related

SAS: Using Weight statement in a Proc Freq command error

In SAS (through WPS Workbench), I am trying to get some frequency counts on my data using the popn field (populations as integers) as a weight.
proc freq data= working.PC_pops noprint;
by District;
weight popn / zeros;
tables AreaType / out= _AreaType;
run;
However, when I run the code above, I am getting the following error pointing to my Weight statement:
ERROR: Found "/" when expecting ;
ERROR: Statement "/" is not valid
I have checked the syntax online, and to include zero counts in the weighting it definitely says to use the "/ zeros" option in the Weight statement, but SAS (WPS) raises an error. What am I doing wrong?
UPDATE: I have now discovered that the zeros option is not supported through WPS Workbench. Is there a workaround to this?
Given you're not using any of the advanced elements of PROC FREQ (the statistical tests), you may be better off using PROC TABULATE. That will allow you to define exactly what levels you want in your output, even if they have zero elements, using a few different methods. Here's a bit of a hacky solution, but it works (at least in SAS 9.4):
data class;
set sashelp.class;
weight=1;
if age=15 then weight=0;
run;
proc freq data=class;
weight weight/zeros;
tables age;
run;
proc tabulate data=class;
class age;
var weight;
weight weight; *note this is WEIGHT, but does not act like weight in PROC FREQ, so we have to hack it a bit by using it as an analysis variable which is annoying;
tables age,sumwgt='Count'*weight=' '*f=2.0;
run;
Both give identical results. You can also use a CLASSDATA data set, which is a bit less hacky, but I'm not sure how well it's supported outside of SAS itself (for example in WPS):
proc sort data=class out=class_classdata(keep=age) nodupkey;
by age;
run;
proc tabulate data=class classdata=class_classdata;
class age;
freq weight; *note this is FREQ not WEIGHT;
tables age,n*f=2.0/misstext='0';
run;

Is it possible to do a paired t-test in SAS using panel data?

I am working with panel data that looks something like this:
I am going to perform a t-test in SAS 9.4 to find out if there is a significant change in var1 from 2014 to 2016, and I am assuming that I have to use a paired t-test, since I have an observation in both 2014 and 2016 for each individual (ID).
My question is, can this be done in SAS when I am using panel data like the one I have shown? Or do I need to create a wide dataset with one variable containing the data from 2014 and one variable containing the data from 2016? I know that I have to do that in Stata, but maybe I don't have to change my entire dataset to do this in SAS?
You will have to transpose your data to do a paired t-test. You can use PROC TRANSPOSE for that, though.
*sort for transpose;
proc sort data=have; by id year; run;
*reformat from long to wide;
proc transpose data=have out=want prefix=Year_;
by ID;
ID Year;
Var Var1;
run;
*Paired T-Test;
proc ttest data=want;
paired Year_2014*Year_2016;
run;
PS. Please include your data as text not an image in the future. We cannot write code off an image and I'm not typing out your data, so at present this is untested but should work.
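Since the question's data was only posted as an image, here is a made-up long-format data set to try the code on. The values are purely illustrative and are not the asker's data; the layout (one row per ID and year, with the variables ID, Year and Var1) is the assumption the transpose step relies on.

```sas
/* Hypothetical panel data: one row per individual per year */
data have;
input ID Year Var1;
datalines;
1 2014 10
1 2016 12
2 2014 8
2 2016 9
3 2014 15
3 2016 14
4 2014 11
4 2016 15
5 2014 9
5 2016 10
;
run;
```

After the transpose, each ID has one row with the columns Year_2014 and Year_2016, which is the form the PAIRED statement expects.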

Normalize a variable (divide by its total)

I have a variable of weight, wprm, that takes integer values. I would like to have one that is the weight "normalized", that is to say wprm/sum(wprm)
I can do that by outputting a proc summary and then merging it back with the original data to divide my wprm variable, but that seems a bit heavy. Is there a simpler way?
Use PROC STDIZE or PROC STANDARD - they both allow various normalization methods.
proc stdize data=have method=sum out=want;
var wprm;
run;
You can grab the macro %simple_normalize from here.
data test;
do i=1 to 10;
output;
end;
run;
%simple_normalize(test,i);
The other common option is SQL, but it writes a note to the log about remerging summary statistics with the original data, which many people don't like.
proc sql;
create table want as
select a.*, a.wprm/sum(a.wprm) as weight
from have as a;
quit;
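If the log note is a concern, the PROC SUMMARY route from the question does not actually need a merge. A minimal sketch, assuming the input data set is called have as above (totals and total are illustrative names): compute the sum once, then read that one-row data set on the first iteration of the DATA step so its value is retained for every row.

```sas
/* One-row data set holding the total of wprm */
proc summary data=have;
var wprm;
output out=totals(keep=total) sum=total;
run;

data want;
/* Read the total once; variables from SET are retained across rows */
if _n_ = 1 then set totals;
set have;
weight = wprm / total;
drop total;
run;
```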

Rank-ordering output from proc logistic

In the SAS proc logistic output, is there a way to rank-order by decreasing value of Wald's Chi square statistic?
Here's a link to an example of using ODS to write procedure output to a SAS dataset. The example uses proc genmod, but the concept applies to proc logistic as well. The documentation for proc logistic contains all of the ODS table names produced by the proc. Find the one you are interested in, write it out to a data set, and sort it however you desire.
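As a rough sketch of that approach for proc logistic: the ParameterEstimates ODS table holds the Wald chi-square for each term in the variable WaldChiSq. The data set name mydata, the response y, and the predictors x1 to x3 below are hypothetical placeholders, not names from the question.

```sas
/* Capture the parameter estimates table in a data set */
ods output ParameterEstimates=pe;
proc logistic data=mydata;
model y(event='1') = x1 x2 x3;
run;

/* Rank-order by decreasing Wald chi-square */
proc sort data=pe;
by descending WaldChiSq;
run;

proc print data=pe;
run;
```

The intercept row is part of the same table, so you may want to filter it out before ranking the predictors.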

Error message with Box-Cox in PROC TRANSREG in SAS

I am trying to use PROC TRANSREG in SAS to transform one of the variables in my dataset (var1). The var1 variable has values >=0.
My code is:
proc transreg data=data1 details;
model boxcox(var1/lambda=-1 to 1 by 0.125 convenient parameter=1)=identity(var2);
output out=BoxCox_Out;
run;
However I get the following error message:
"observation of nonblank _TYPE_ not equal 'Score ' are excluded from the analysis and the output data set."
Could anyone help me?
TRANSREG treats a variable named _TYPE_ in the input data set specially: it lets you pass in data sets with several kinds of rows (often the output of an earlier TRANSREG run) and use only the SCORE rows, or whichever ones you choose.
However, _TYPE_ is also a common variable added by procedures like PROC MEANS to indicate which class combinations apply to the row. In this case, TRANSREG is most likely treating that leftover variable as its own row-type indicator and excluding the rows.
Drop the _TYPE_ variable with a DROP= data set option on the DATA= input of PROC TRANSREG, and it should use all rows.
proc transreg data=data1(drop=_type_) details;
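Combined with the rest of the code from the question, the complete step would look roughly like this (all names are the ones used in the question; only the DROP= option is new):

```sas
/* Same Box-Cox model as in the question, with _TYPE_ dropped on input */
proc transreg data=data1(drop=_type_) details;
model boxcox(var1/lambda=-1 to 1 by 0.125 convenient parameter=1)=identity(var2);
output out=BoxCox_Out;
run;
```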