Plot of observation number - sas

I want to plot a variable against its observation number. From http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_reg_sect017.htm
I see that the keyword is OBS.
So my code is:
proc gplot data=my_data;
plot OBS.*my_v;
run;
But I still get an error. What is the proper way to do this?

Your reference describes PROC REG, not PROC GPLOT. PROC GPLOT has no OBS. keyword, so you need to add the observation number to the data set yourself, using the automatic variable _N_:
data my_data;
set my_data;
obs=_n_;
run;
proc gplot data=my_data;
plot OBS*my_v;
run;
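If you would rather not overwrite my_data, a DATA step view can supply the observation number on the fly. This is only a sketch: the view name my_data_obs is made up for illustration, and it assumes that SGPLOT (available from SAS 9.2) is an acceptable substitute for GPLOT.

```sas
/* Hypothetical view name: the view adds OBS without copying or overwriting my_data */
data my_data_obs / view=my_data_obs;
  set my_data;
  obs = _n_;   /* _N_ counts DATA step iterations, which matches the row number here */
run;

proc sgplot data=my_data_obs;
  scatter x=obs y=my_v;   /* observation number on the horizontal axis */
run;
```

Swap x= and y= if you want the observation number on the vertical axis, as in the GPLOT request OBS*my_v.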

Related

Required ordering for statements and options within SAS procedures

In many cases, one can choose any order for statements and options within SAS procedures.
For instance, as far as statement order is concerned, the following two
PROC FREQ steps, in which the order of the BY and TABLES statements is swapped,
are equivalent:
PROC SORT DATA=SASHELP.CLASS OUT=class;
BY Sex;
RUN;
PROC FREQ DATA=class;
BY Sex;
TABLES Age;
RUN;
PROC FREQ DATA=class;
TABLES Age;
BY Sex;
RUN;
Similarly, as far as option order is concerned, the following two PROC PRINT steps, in which the order of the OBS= and FIRSTOBS= options is swapped, are equivalent:
PROC PRINT DATA=SASHELP.CLASS (FIRSTOBS=2 OBS=5);
RUN;
PROC PRINT DATA=SASHELP.CLASS (OBS=5 FIRSTOBS=2);
RUN;
But there are some exceptions.
For instance, as far as option order is concerned, consider the following two PROC PRINT steps, which differ only in the location of the NOOBS option. The first is correct, while the second, where NOOBS precedes the parentheses, results in an error:
PROC PRINT DATA=SASHELP.CLASS (FIRSTOBS=2 OBS=5) NOOBS;
RUN;
PROC PRINT DATA=SASHELP.CLASS NOOBS (FIRSTOBS=2 OBS=5);
RUN;
Similarly, as far as statement order is concerned, I have occasionally met cases where a certain statement must be placed before one or more other statements. Unfortunately, I don't remember in which procedure (probably a statistical one, for duration or multilevel models).
The ordering question within data steps might be seen as a completely different one, because there the order of statements is mostly a matter of logic. Still, the ordering of some data step statements looks partly conventional, as within procedures. This is the case in the following merge, where the MERGE statement must precede the BY statement, although I suppose that SAS could have been designed to understand these statements in any order:
/* to get a simple example of merge I start with artificially cutting the Class dataset in two parts */
PROC SORT DATA=SASHELP.CLASS OUT=class;
BY Name;
RUN;
DATA sex_and_age;
SET class (KEEP=Name Sex Age);
RUN;
DATA height_and_weight;
SET class (KEEP=Name Height Weight);
RUN;
DATA all_variables;
MERGE sex_and_age height_and_weight;
BY Name;
RUN;
Because I have been unable to find such a guide, my question is: is there a text devoted to the required order of statements and options within SAS procedures?
Joel,
Let me address the NOOBS example to help clarify. Consider the two statements:
PROC PRINT DATA=SASHELP.CLASS (FIRSTOBS=2 OBS=5) NOOBS;
PROC PRINT DATA=SASHELP.CLASS NOOBS (FIRSTOBS=2 OBS=5);
The items in parentheses (FIRSTOBS= and OBS=) are dataset options, and they affect how the dataset is read. There are a number of them, including KEEP, DROP, WHERE, etc. NOOBS is not a dataset option, so you get an error when it sits between the dataset name and the parentheses. Dataset options must immediately follow the dataset name.
The order of statements is important in many cases because it determines how the PDV (program data vector) is built. That is why an ATTRIB statement should be at the top of a data step. For some procs the order does not matter, since the statements are all combined before execution.
data test;
attrib myNewVar length=$8 format=$20.
myNewVar2 format=date.
;
set sashelp.class;
myNewVar = 'Hey Joel!';
myNewVar2 = '24FEB2020'd;
run;
A parenthetical list of name=value pairs after a data set specifier is known as data set options. To predict whether a placement is valid, you need to anticipate what the SAS parser will be doing at that point.
* (...) applies to SASHELP.CLASS;
PROC PRINT DATA=SASHELP.CLASS (FIRSTOBS=2 OBS=5);
* (...) sits where a PROC option (name or name=value) is expected -- error ensues;
PROC PRINT DATA=SASHELP.CLASS NOOBS (FIRSTOBS=2 OBS=5);
* (...) applies to SASHELP.CLASS, NOOBS is in a proper option location within the PROC statement;
PROC PRINT NOOBS DATA=SASHELP.CLASS (FIRSTOBS=2 OBS=5);
Any special statement ordering is found in the PROC documentation. Some procs have common syntax and documentation will redirect you.
Your first point appears to come from not understanding what dataset options are. Otherwise, the order of the optional parts of a statement (like PROC PRINT) is specified in the documentation for that statement.
To the second point it appears you are confusing the purpose of the BY statement in a PROC and the BY statement in a data step. In a PROC step the BY statement tells it to process the data in groups. In a DATA step the BY statement must be linked to a specific MERGE/SET/UPDATE statement.
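To make the difference concrete, here is a small sketch using the sorted copy of SASHELP.CLASS from the question. The output dataset name first_per_sex is made up for illustration.

```sas
proc sort data=sashelp.class out=class;
  by sex;
run;

/* PROC step: BY only splits the analysis into one table per value of Sex */
proc freq data=class;
  tables age;
  by sex;
run;

/* DATA step: BY applies to the SET statement just before it
   and creates the temporary variables FIRST.sex and LAST.sex */
data first_per_sex;
  set class;
  by sex;
  if first.sex;   /* keep only the first row of each Sex group */
run;
```

Because the DATA step BY statement modifies how the preceding SET or MERGE reads its input, it has to come after that statement, which is why the order is not interchangeable there.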

proc sgpanel reg only on one modality of group

I have a problem with SAS 9.2.
Here is my code:
data test;
set sashelp.cars;
WHERE Make in ('Acura', 'Audi');
run;
proc sgpanel data=test;
panelby origin;
reg y=Weight x=Length/group=Make;
run;
The regression is applied to both modalities of the variable Make.
Is it possible to apply it to only one of them? For instance, only to the Make 'Audi'?
Thanks a lot.
Just make two separate plots with sgplot.
/* regression plot for the one you need the regression */
proc sgplot data=sashelp.cars;
WHERE Make eq 'Audi';
reg y=Weight x=Length/group=Make;
run;
/* scatter plot for the one you need no regression */
proc sgplot data=sashelp.cars;
WHERE Make eq 'Acura';
scatter y=Weight x=Length/group=Make;
run;
Also note that I integrated the selection (the WHERE statement) into the graphical procedure.
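If you want to keep the single paneled display, one possible alternative is to copy the response into a helper variable that is missing for every Make except 'Audi' and fit the regression on that variable only. This is a sketch, not tested on SAS 9.2: the variable name Weight_Audi is made up for illustration, and I am assuming the NOMARKERS option of the REG statement is available in your release.

```sas
data test;
  set sashelp.cars;
  where Make in ('Acura', 'Audi');
  /* Hypothetical helper: non-missing only for the Make that gets a fit line */
  if Make = 'Audi' then Weight_Audi = Weight;
run;

proc sgpanel data=test;
  panelby origin;
  /* Markers for both makes */
  scatter y=Weight x=Length / group=Make;
  /* Fit line from the Audi rows only; NOMARKERS avoids drawing the points twice */
  reg y=Weight_Audi x=Length / nomarkers;
run;
```

Panels that contain no Audi rows simply have nothing to fit, so they show the scatter points alone.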

Exporting SAS data into SPSS with value labels

I have a simple data table in SAS, where I have the results from a survey I sent to my friends:
DATA Questionnaire;
INPUT make $ Question_Score ;
CARDS;
Ned 1
Shadowmoon 2
Heisenberg 1
Athelstan 4
Arnold 5
;
RUN;
What I want to do, using SAS, is to export this table to SPSS (.sav) and also have the value labels for Question_Score, as shown in the picture below:
I then created a format in SAS (in the hope that this would do it):
PROC FORMAT;
VALUE Question_Score_frmt
1="Totally Agree"
2="Agree"
3="Neutral"
4="Disagree"
5="Totally Disagree"
;
run;
PROC FREQ DATA=Questionnaire;
FORMAT Question_Score Question_Score_frmt.
;
TABLES Question_Score;
RUN;
and finally export the table to a .sav file using the fmtlib option:
proc export data=Questionnaire outfile="D:\Questionnaire.sav"
dbms=spss replace;
fmtlib=work.Q1frmt;
quit;
Only to disappoint myself seeing that it didn't work.
Any ideas on how to do this?
Unfortunately, you didn't apply the format to the dataset; you applied it only within the PROC FREQ step. You would need to use PROC DATASETS or a data step to apply it to the dataset.
proc datasets lib=work;
modify questionnaire;
format Question_Score Question_Score_frmt.;
run;
quit;
Then the export will include the format, provided SAS considers it compatible with SPSS's value label rules. Note that SAS's understanding of SPSS's rules is quite old (based, I think, on SPSS version 9), so unfortunately it still fails fairly often.
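For completeness, the data step route mentioned above could look like the sketch below, followed by the export. It assumes the format was created by the PROC FORMAT step in the question and therefore lives in the default catalog WORK.FORMATS, and that SAS/ACCESS Interface to PC Files is licensed. Whether the labels survive still depends on the compatibility caveat above.

```sas
/* Attach the format to the dataset itself, using a DATA step instead of PROC DATASETS */
data Questionnaire;
  set Questionnaire;
  format Question_Score Question_Score_frmt.;
run;

/* Assumption: the format is stored in the default catalog WORK.FORMATS */
proc export data=Questionnaire
    outfile="D:\Questionnaire.sav"
    dbms=spss replace;
  fmtlib=work.formats;
run;
```

Note that FMTLIB= points at a format catalog, not at an individual format, which is one reason the work.Q1frmt reference in the question could not work.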

Normalize a variable (divide by its total)

I have a weight variable, wprm, that takes integer values. I would like to have a "normalized" version of it, that is to say wprm/sum(wprm).
I can do that by outputting the total from a PROC SUMMARY, merging it back with the original data, and then dividing my wprm variable, but that seems a bit heavy. Is there a simpler way?
Use PROC STDIZE or PROC STANDARD - they both allow various normalization methods.
proc stdize data=have method=sum out=want;
var wprm;
run;
You can grab the macro %simple_normalize from here.
data test;
do i=1 to 10;
output;
end;
run;
%simple_normalize(test,i);
The other common option is SQL, but it writes a note to the log about remerging summary statistics with the original data, which many people don't like.
proc sql;
create table want as
select a.*, a.wprm/sum(a.wprm) as weight
from have as a;
quit;
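If you prefer to stay close to the summary-then-merge idea from the question but without an actual MERGE, a common idiom is to read the one-row total once at the top of the data step. This is a sketch: the names total and wprm_total are made up for illustration, and weight follows the SQL example above.

```sas
/* Step 1: compute the grand total into a one-row dataset */
proc means data=have noprint;
  var wprm;
  output out=total(keep=wprm_total) sum=wprm_total;
run;

/* Step 2: read the total once; variables read by SET are retained across iterations */
data want;
  if _n_ = 1 then set total;
  set have;
  weight = wprm / wprm_total;
run;
```

This leaves no remerge note in the log and needs no sorting or BY variable.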

Running All Variables Through a Function in SAS

I am new to SAS and need to sgplot 112 variables. The variable names are all very different and may change over time. How can I call each variable in the statement without having to list all of them?
Here is what I have done so far:
%macro graph(var);
proc sgplot data=monthly;
series x=date y=&var;
title "&var";
run;
%mend;
%graph(gdp);
%graph(lbr);
The above code can be a pain since I have to list 112 %graph() lines and then change the names in the future as the variable names change.
Thanks for the help in advance.
List processing is the concept you need for something like this. Depending on the case, you can also approach it with BY-group processing or, for graphs, paneling.
Create a dataset that contains the list of variables, from whatever source is convenient to you. This could be an Excel or text file, or it could be created from your data if there is a way to tell programmatically which variables you need.
Then you can use any of a number of methods to produce this:
proc sql;
select cats('%graph(',var,')')
into: graphlist separated by ' '
from yourdata;
quit;
&graphlist
For example.
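If the list should simply be "every numeric variable in the dataset except the date", the variable list can come from DICTIONARY.COLUMNS instead of a hand-made dataset. This is a sketch under assumptions: it supposes the data is WORK.MONTHLY with a variable named date, as in the question, and that the %graph macro from the question is already defined.

```sas
proc sql noprint;
  /* Single quotes keep %graph from executing while the list is being built */
  select cats('%graph(', name, ')')
    into :graphlist separated by ' '
    from dictionary.columns
    where libname = 'WORK'          /* library and member names are stored in uppercase */
      and memname = 'MONTHLY'
      and type = 'num'
      and upcase(name) ne 'DATE';   /* do not plot the x variable against itself */
quit;

/* Resolving the macro variable runs one %graph() call per variable */
&graphlist
```

When variables are added or renamed later, rerunning this picks up the new names with no edits.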
In your case, you could also generate a vertical dataset with one row per variable value, which might make it easier to determine which variables are correct:
data citiwk;
set sashelp.citiwk;
length var $4; /* without this, var gets length 3 from 'COM' and 'INDU' is truncated */
var='COM';
val=WSPCA;
output;
var='UTI';
val=WSPUA;
output;
var='INDU';
val=WSPIA;
output;
val=WSPGLT;
var='GOV';
output;
keep val var date;
run;
proc sort data=citiwk;
by var date;
run;
proc sgplot data=citiwk;
by var;
series x=date y=val;
run;
While I hardcoded those four, you could easily create an array and use VNAME() to get the variable name or VLABEL() to get the variable label of each array element.
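The array version could look like the sketch below. The dataset name citiwk_long and the array name series are made up for illustration; the four variables are the same ones hardcoded above, so here var holds the original variable names rather than the short codes.

```sas
data citiwk_long;
  set sashelp.citiwk;
  array series[*] WSPCA WSPUA WSPIA WSPGLT;
  length var $32;                 /* long enough for any SAS variable name */
  do i = 1 to dim(series);
    var = vname(series[i]);       /* name of the current array element */
    val = series[i];
    output;                       /* one row per variable per date */
  end;
  keep date var val;
run;

proc sort data=citiwk_long;
  by var date;
run;

proc sgplot data=citiwk_long;
  by var;
  series x=date y=val;
run;
```

Replacing vname() with vlabel() would put the variable labels into var instead, in which case the length should be raised to fit the longest label.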