I estimate the following model:
proc surveyreg data = data;
cluster id;
model y = x1 x2 x3 x4;
run;
quit;
I want to test the following two hypotheses jointly: x1 = x3, x2 = x4.
I know how to do it in other regression procedures. For example, in proc panel I just do: "test x1 = x3, x2 = x4;" and it gives me the respective Wald Test statistic. However, I have trouble doing it (via use of the contrast statement?) in proc surveyreg.
I appreciate any help!
You might want to look at the documentation for the CONTRAST statement.
proc contents data=sashelp.fish varnum;
proc print data=sashelp.fish;
run;
proc glm /*surveyreg*/ data=sashelp.fish;
model width = Weight Length1 Length2 Length3 Height;
contrast 'length1=length2, length3=height' length1 1 length2 -1, length3 1 height -1 / e;
run;
proc reg data=sashelp.fish plots=none;
model width = Weight Length1 Length2 Length3 Height;
TEST1: Test length1=length2, length3=height;
run; quit;
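Applied to the model in the question, the same idea might look like the sketch below. It assumes the CONTRAST statement in PROC SURVEYREG accepts the same comma-separated multi-row form used in the PROC GLM example above; the data set and variable names are taken from the question, and the label text is arbitrary.

```sas
/* Sketch: joint test of x1 = x3 and x2 = x4 in PROC SURVEYREG.      */
/* Each comma-separated row of the CONTRAST statement is one linear   */
/* restriction; together the rows form a single joint test.           */
proc surveyreg data=data;
    cluster id;
    model y = x1 x2 x3 x4;
    contrast 'x1=x3, x2=x4' x1 1 x3 -1,
                            x2 1 x4 -1;
run;
```

The contrast output should report one test with two numerator degrees of freedom, which plays the role of the Wald test that the TEST statement gives in other procedures.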
What is the easiest way to plot how a variable y changes over x in SAS (PC SAS), for example temperature over time? Thank you very much!
Two common ways are the scatter and series statements in Proc SGPLOT.
For example:
data have;
do time = 0 to 24;
time = time + ranuni(123) - 1.2;
temp = 60 + 6 * sin(time / 7.6);
rowNumber + 1;
output;
end;
run;
ods graphics on / width=325px;
proc sgplot data=have;
title "Scatter";
footnote "Created: &sysdate";
scatter x=time y=temp;
run;
proc sgplot data=have;
title "Series";
footnote "Created: &sysdate";
series x=time y=temp / markers;
run;
I have a dataset like so
data test;
do i = 1 to 100;
x1 = ceil(ranuni(0) * 100);
x2 = floor(ranuni(0) * 1600);
x3 = ceil(ranuni(0) * 1500);
x4 = ceil(ranuni(0) * 1100);
x5 = floor(ranuni(0) * 10);
output;
end;
run;
data test_2;
set test;
if mod(x1,3) = 0 then x1 = .;
if mod(x2,13) = 0 then x2 = .;
if mod(x3,7) = 0 then x3 = .;
if mod(x4,6) = 0 then x4 = .;
if mod(x5,2) = 0 then x5 = .;
drop i;
run;
I plan to calculate a number of percentiles including two non-standard percentiles (2.5th and 97.5th). I do this using proc stdize as below
PROC STDIZE
DATA=test_2
OUT=_NULL_
NOMISS
PCTLMTD=ORD_STAT
pctldef=3
OUTSTAT=STDLONGPCTLS
pctlpts=(2.5 5 25 50 75 95 97.5);
VAR _NUMERIC_;
RUN;
Comparing to proc means
DATA TEST_MEANS;
SET TEST_2;
IF NOT MISSING(X1);
IF NOT MISSING(X2);
IF NOT MISSING(X3);
IF NOT MISSING(X4);
IF NOT MISSING(X5);
RUN;
PROC MEANS
DATA=TEST_MEANS NOPRINT;
VAR _NUMERIC_;
OUTPUT OUT=MEANSWIDEPCTLS P5= P25= P50= P75= P95= / AUTONAME;
RUN;
However, when I compare the results above with those produced by Excel and proc means, they don't match. I suspect it has something to do with how SAS treats missing values as -inf. Can someone confirm which results are correct?
You are using PCTLDEF=3 in PROC STDIZE but the default percentile definition in PROC MEANS, which is 5. I tested your code with PCTLDEF=3 in PROC MEANS and got matching results.
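A minimal sketch of that check, reusing the TEST_MEANS data set from the question. The output data set name MEANSWIDEPCTLS_DEF3 is made up for illustration; PCTLDEF= is used here as the PROC MEANS alias for QNTLDEF=.

```sas
/* Sketch: request percentile definition 3 in PROC MEANS so that it   */
/* uses the same definition as the PROC STDIZE call (pctldef=3).      */
proc means data=test_means noprint pctldef=3;
    var _numeric_;
    output out=meanswidepctls_def3 p5= p25= p50= p75= p95= / autoname;
run;
```

Note that PROC MEANS has no built-in keywords for the 2.5th and 97.5th percentiles, so only the five standard percentiles can be compared this way.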
I have one dataset per year from 2001 to 2014, each of which looks like the following. Each year is stored in its own file, yXXXX.sas7bdat,
ID Weight X1 X2 X3
1 100 1 2 4
2 300 4 3 4
and I need to create a dataset where for each year we have the (weighted) sums of each of the X columns.
X1 X2 X3 Year
10 20 30 2014
40 15 20 2013
I would be happy to implement this in a macro, but I am unsure how to isolate the column sums, and how to efficiently stack the results together (proc append?).
Edit: Including an attempt.
%macro final_dataset;
%do i = 2001 %to 2014;
/*Code here which enables me to get the column sums I am interested in.*/
proc means data = y&i;
weight = weight;
X1 = SUM X1;
X2 = SUM X2;
X3 = SUM X3;
OUTPUT OUT = sums&i;
run;
data final;
set final sums&i;
run;
%end;
%mend;
Edit: Another attempt.
%macro final_dataset;
%do i = 2001 %to 2014;
/*Code here which enables me to get the column sums I am interested in.*/
proc means data = y&i SUM;
weight = weight;
var X1 X2 X3;
OUTPUT OUT = sums&i;
run;
data final;
set final sums&i;
run;
%end;
%mend;
Edit: Final.
%macro final_dataset;
%do i = 2001 %to 2014;
/*Code here which enables me to get the column sums I am interested in.*/
proc means data = y&i SUM NOPRINT;
weight weight;
var X1 X2 X3;
OUTPUT OUT = sums&i sum(X1 X2 X3) = X1 X2 X3;
run;
data final;
set final sums&i;
run;
%end;
%mend;
This is probably what I'd do, append all the data sets together and run one proc means. You didn't mention how big the data sets are, but I'm assuming smaller data.
data combined;
length source year $50.;
set y2001-y2014 indsname=source;
*you can tweak this variable so it looks how you want it to;
year=source;
run;
proc means data=combined noprint nway;
class year;
var x1 x2 x3;
output out=want sum= ;
run;
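The question asks for weighted sums and a Year column, so a variant of the step above might look like the following sketch. It assumes the weight variable is named Weight as in the sample data, that the data sets live in WORK so that INDSNAME= returns values such as WORK.Y2001, and that a numeric year is wanted; the output name WANT_WEIGHTED is made up for illustration.

```sas
/* Sketch: stack the yearly files, derive a numeric year from the     */
/* source data set name, then compute weighted sums per year.         */
data combined;
    length source $41;
    set y2001-y2014 indsname=source;
    /* source looks like WORK.Y2001: take the member name and keep    */
    /* only its digits                                                 */
    year = input(compress(scan(source, 2, '.'), , 'kd'), 4.);
run;

proc means data=combined noprint nway;
    class year;
    weight weight;            /* sums become sum(weight * x)           */
    var x1 x2 x3;
    output out=want_weighted(drop=_type_ _freq_) sum=;
run;
```

With the WEIGHT statement in place, each output row holds the weighted sums of X1, X2 and X3 for one year.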
I have a person level dataset with three categorical variables V1, V2 and V3. I want to use Proc Tabulate to calculate means of variable X1, X2, and X3 by the three categories listed above as well as a count of persons and the percentage out of V1 and V2 (i.e when V3 is all).
Here is my first attempt.
Proc tabulate data = in_data
Out = out_data;
Var X1 X2 X3;
Class V1 V2 V3;
Table (V1 all) * (V2 all) * (V3 all), N mean pctn<V1 V2>;
Run;
This gives me the error message “Statistics other than N was requested without analysis variable in the following nesting V1*V2*V3*Mean”. I don’t think I have the syntax quite right. Any ideas on how I can fix it? Thanks.
You need to include the variables in the table statement. I think this should work:
Proc tabulate data = in_data
Out = out_data;
Var X1 X2 X3;
Class V1 V2 V3;
Table (V1 all) * (V2 all) * (V3 all), (X1 X2 X3)*(N mean);
Run;
This works for me:
Proc tabulate data = sashelp.class
Out = out_data;
Var age weight height;
Class sex;
Table (sex all), (age weight height)*( N mean);
Run;
EDIT:
Your issue seems to be specific to your data, or something else is going on. You'll have to include some sample data.
Here's a reproduction in which the class variables take the value 0, with no issues in the summary.
data have;
do i=1 to 1000;
v1=rand('bernoulli', 0.4);
v2=rand('bernoulli', 0.7);
x1=rand('uniform')*3+1;
x2=rand('uniform')*9+1;
output;
end;
drop i;
run;
proc print data=have(obs=10);
run;
proc tabulate data=have out=check;
class v1 v2;
var x1 x2;
table (v1 all) (v2 all), (x1 x2)*(n mean);
run;
SAS newbie here.
My question is about PROC REG in SAS; let's assume I have already created a model and now I would like to use this model, and known predictor variables to estimate a response value.
Is there a clean and easy way of doing this in SAS? So far I've been manually grabbing the intercept and the coefficients from the output of my model to calculate the response variable but as you can imagine it can get pretty nasty when you have a lot of covariates. Their user's guide is pretty cryptic...
Thanks in advance.
@Reese is correct. Here is some sample code to get you up the learning curve faster:
/*Data to regress*/
data test;
do i=1 to 100;
x1 = rannor(123);
x2 = rannor(123)*2 + 1;
y = 1*x1 + 2*x2 + 4*rannor(123);
output;
end;
run;
/*Data to score*/
data to_score;
_model_ = "Y_on_X";
y = .;
x1 = 1.5;
x2 = -1;
run;
/*Method 1: just put missing values on the input data set and
PROC REG will do it for you*/
data test_2;
set test to_score;
run;
proc reg data=test_2 alpha=.01 outest=est;
Y_on_X: model y = x1 x2;
output out=test2_out(where=(y=.)) p=predicted ucl=UCL_Pred lcl=LCL_Pred;
run;
quit;
proc print data=test2_out;
run;
/*Method 2: Use the coefficients and the to_score data with
PROC SCORE*/
proc score data=to_score score=est out=scored type=parms;
var x1 x2;
run;
proc print data=scored;
var Y_on_X X1 X2;
run;
Two ways:
1. Append the data you want to score to the data set you're going to use to get estimates, but leave the y value missing. Grab the predicted values using the OUTPUT statement in PROC REG.
2. Use PROC SCORE:
http://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_score_sect018.htm