Dynamic n in the LAG<n>(variable) function in SAS

Do you know how to make the n in LAGn(variable) refer to a macro variable set elsewhere in the program (max, in my case)?
data example1;
input value;
datalines;
1.0
3.0
1.0
1.0
4.0
1.0
1.0
2.0
4.0
2.0
;
proc means data=example1 max;
output out=example11 max=max;
run;
data example1;
%let n = max;
lagval=lag&n.(value);
run;
proc print data=example1;
run;
Thank you in advance!
Wiola

Is this what you're trying to do?
data example1;
input value;
datalines;
1.0
3.0
1.0
1.0
4.0
1.0
1.0
2.0
4.0
2.0
;
proc sql;
select max(value) format = 1. into :n
from example1;
quit;
data example1;
set example1;
lagval=lag&n(value);
run;
The format = 1. part makes sure that the macro variable generated by proc sql doesn't contain any leading or trailing spaces that would mess up the subsequent data step code. Note that a width of 1 only suits single-digit values.
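If the maximum can have more than one digit, a fixed-width format becomes fragile. Here is a minimal sketch of an alternative, assuming SAS 9.3 or later (where the TRIMMED keyword is available in the INTO clause) and assuming the maximum is a whole number, since LAGn needs an integer n. The output dataset name example2 is made up for illustration.

```sas
/* Store the maximum in macro variable n without leading/trailing blanks */
proc sql noprint;
  select max(value) into :n trimmed
  from example1;
quit;

/* Check what the macro variable resolved to before using it */
%put NOTE: n resolves to &n;

data example2;
  set example1;
  lagval = lag&n(value);  /* the macro processor substitutes n before the step compiles */
run;
```

Writing the value to the log with %PUT first makes it easy to see whether stray blanks or an unexpected value ended up in the macro variable.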

It is easy to use a macro variable to generate the N part of LAGn() function call.
%let n=4 ;
data want;
set have ;
newvar = lag&n(oldvar);
run;
Remember that macro code is evaluated by the macro pre-processor and then the generated code is executed by SAS. So placing %LET statements in the middle of a data step is just going to confuse the human programmer.
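To make the timing concrete, here is a small sketch (not from the original question) showing that a %LET inside a data step only ever sees text, never the value of a data step variable:

```sas
data _null_;
  max = 4;
  /* The macro processor runs this line while the step is still being compiled. */
  /* It assigns the TEXT "max" to n, not the value 4 held by the variable max.  */
  %let n = max;
run;

%put n resolves to: &n;
```

With n holding the text max, lag&n.(value) in the question becomes lagmax(value), which is not a valid function name.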

how to generate unique random vector on each iteration?

I'm new to SAS, and I would like to produce a plot for each random numerical vector.
I have therefore wrapped my PROC IML code in a macro and tried to invoke it before calling the macro generate_scatter_plot, but I get the same set of points on each iteration.
Can somebody please explain the proper way to do this in SAS?
%MACRO generate_random_points();
proc iml;
N = 6;
rands = j(N,1);
call randgen(rands, 'Uniform'); /* SAS/IML 12.1 */
submit rands;
data my_data;
input x y @@;
datalines;
&rands
;
run;
endsubmit;
%MEND;
%MACRO generate_scatter_plot();
/* call execute('%generate_random_points();'); */
proc sgplot data=my_data;
scatter x=x y=y;
run;
%MEND;
data _null_;
do i = 1 to 20;
call execute('%generate_scatter_plot();');
end;
run;
I find SAS different from the rest of languages out there.
Thank you in advance to all who are willing to help!
IML is not needed; a data step loop can generate the random values.
Assuming you're looking to learn macro programming:
CALL EXECUTE is needed to invoke a macro from inside a data step, but not outside one, where you can call the macro directly.
CALL EXECUTE can also generate plain code, much as a macro does.
The MPRINT and MLOGIC options help when debugging macro code; without them the generated code is not written to the log.
The following expands a bit on your logic to demonstrate how macros work.
options mprint mlogic;
%macro generate_random_points(Num=);
*Macro to generate random numbers;
*number of points generated are equal to the NUM=parameter;
data my_data;
do i=1 to &num.;
x=rand('uniform');
y=rand('uniform');
output;
end;
run;
%mend;
%macro generate_scatter_plot(Num_Points=);
*create random data with specified points;
%generate_random_points(Num=&Num_Points);
*graph data;
proc sgplot data=my_data;
scatter x=x y=y;
run;
%MEND;
*Run macro with different parameters in loop;
data _null_;
do i = 3 to 5;
call execute(catt('%generate_scatter_plot(Num_Points=', i, ');'));
end;
run;
option nomprint nomlogic;
And a slight variation on your process:
data _null_;
do i = 3 to 5;
call execute(catt("Title 'Num Points = ", i, " '; ", ' %generate_scatter_plot(Num_Points=', i, ');'));
end;
run;
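Since CALL EXECUTE is only needed inside a data step, the same loop can also be written in pure macro code. A rough sketch, assuming the %generate_scatter_plot macro defined above has already been compiled; the wrapper name run_plots and its parameters are made up for illustration:

```sas
%macro run_plots(from=3, to=5);
  %local i;  /* keep the loop counter local to this macro */
  %do i = &from %to &to;
    title "Num Points = &i";
    %generate_scatter_plot(Num_Points=&i);
  %end;
  title;     /* clear the title afterwards */
%mend run_plots;

%run_plots(from=3, to=5);
```

The %DO loop has to live inside a macro definition, which is why the wrapper is needed.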
If you are working in IML you should not have any need to use the SAS macro language to generate code.
You already showed how you can generate the random numbers into an IML matrix.
And you can use the SUBMIT/ENDSUBMIT block to call your PROC SGPLOT code.
What you seem to be missing is the IML syntax for converting a matrix into a dataset. https://blogs.sas.com/content/iml/2011/04/18/writing-data-from-a-matrix-to-a-sas-data-set.html
proc iml;
N = 6;
x = t(1:N);
y = j(N,1);
call randgen(y, 'Uniform');
create my_data var {x y};
append;
close my_data;
submit;
proc sgplot data=my_data;
scatter x=x y=y;
run;
endsubmit;
quit;
Although you are using IML to pass data as text into a datalines statement, you really do not need to do this. There are simpler ways of achieving your goal.
SAS does everything through datasets. They're analogous to Data Frames in Pandas. If you want to create a random vector of data, you'll create it within a dataset and use that within other procedures. datalines should be avoided in production whenever possible. There are some very special cases where it is useful, but it's mainly used for sample data or prototyping.
SAS will randomly generate data based on the system clock unless you set a seed through call streaminit(). You should always get new points. A much simpler way to achieve your results is shown below. The below macro will generate a new random dataset and plot it each time you call it.
%macro generate_scatter_plot(n=100);
data random;
do i = 1 to &n;
x = rand('uniform');
y = rand('uniform');
output;
end;
drop i;
run;
proc sgplot data=random;
scatter x=x y=y;
run;
%mend;
%generate_scatter_plot(n=100);
%generate_scatter_plot(n=1000);
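If you ever want the opposite behaviour, that is, the same points on every run, you can set a seed with call streaminit() as mentioned above. A minimal sketch; the seed 12345 and the dataset name random_fixed are arbitrary choices for illustration:

```sas
data random_fixed;
  call streaminit(12345);  /* fixed seed: the same stream on every run */
  do i = 1 to 5;
    x = rand('uniform');
    y = rand('uniform');
    output;
  end;
  drop i;
run;

proc print data=random_fixed;
run;
```

Removing the streaminit call goes back to a clock-based seed, so each run produces different points.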

SAS - Create Dummy Variables for All Variables

I have a dataset with X number of categorical variables for a given record. I would like to turn this dataset into a new dataset with dummy variables, but I want a single command or macro that takes the dataset and creates the dummy variables for all variables in it.
I also don't want to specify the name of each variable, because I could have a dataset with 50 variables and it would be too cumbersome to list each one.
Let's say I have a table like this, and I want the resulting table, subject to the conditions above: a single command or macro, without specifying each individual variable.
You can use PROC GLMSELECT to generate the design matrix, which is what you are asking for.
data test;
input id v1 $ v2 $ v3 $ ;
datalines;
1 A A A
2 B B B
3 C C C
4 A B C
5 B A A
6 C B A
;
proc glmselect data=test outdesign(fullmodel)=test_design noprint ;
class v1 -- v3;
model id = v1 -- v3 /selection=none noint;
run;
You can use the -- to specify all variables between the first and last. Notice I don't have to type v2. So if you know the first and the last, you can get what you want easily.
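If you don't even know the first and last variable names, one option is to pull them from the dictionary tables. This is only a sketch under the assumption that every character variable in WORK.TEST should become a set of dummies; the macro variable name charvars is made up for illustration.

```sas
/* Collect the names of all character variables in WORK.TEST */
proc sql noprint;
  select name into :charvars separated by ' '
  from dictionary.columns
  where libname = 'WORK' and memname = 'TEST' and type = 'char';
quit;

%put NOTE: charvars = &charvars;

proc glmselect data=test outdesign(fullmodel)=test_design noprint;
  class &charvars;
  model id = &charvars / selection=none noint;
run;
```

LIBNAME and MEMNAME values are stored in upper case in dictionary.columns, so the comparison strings need to be upper case as well.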
I prefer GLMMOD myself. One note, if you can, CLASS variables are usually a better way to go, but not supported by all PROCS.
/*Run model within PROC GLMMOD for it to create design matrix
Include all variables that might be in the model*/
proc glmmod data=sashelp.class outdesign=want outparm=p;
class sex age;
model weight=sex age height;
run;
/*Create rename statement automatically
THIS WILL NOT WORK IF YOUR VARIABLE NAMES WILL END UP OVER 32 CHARS*/
data p;
set p;
if _n_=1 and effname='Intercept' then
var='Col1=Intercept';
else
var=catt("Col", _colnum_, "=", catx("_", effname, vvaluex(effname)));
run;
proc sql ;
select var into :rename_list separated by " " from p;
quit;
/*Rename variables*/
proc datasets library=work nodetails nolist;
modify want;
rename &rename_list;
run;
quit;
proc print data=want;
run;
Originally from here and the post has links to several other methods.
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-dummy-variables-Categorical-Variables/ta-p/308484
Here is a worked example using your simple three-observation dataset and a modified version of the PROC GLMMOD method posted by @Reeza.
First let's make a sample dataset with a long character ID variable. We will introduce a numeric ROW variable that we can later use to merge the design matrix back with the input data.
data have;
input id :$21. education_lvl $ income_lvl $ ;
row+1;
datalines;
1 A A
2 B B
3 C C
;
You could set the list of variables into a macro variable since we will need to use it in multiple places.
%let varlist=education_lvl income_lvl;
Use PROC GLMMOD to generate the design matrix and the parameter list that we will later use to generate user friendly variable names.
proc glmmod data=have outdesign=design outparm=parm noprint;
class &varlist;
model row=&varlist / noint ;
run;
Now let's use the parameter list to generate rename statement to a temporary text file.
filename code temp;
data _null_;
set parm end=eof;
length rename $65 ;
rename = catx('=',cats('col',_colnum_),catx('_',effname,of &varlist));
file code ;
if _n_=1 then put 'rename ' ;
put @3 rename ;
if eof then put ';' ;
run;
Now let's merge back with the input data and rename the variables in the design matrix.
data want;
merge have design;
by row ;
%inc code / source2;
run;

PROC REPORT within DATA in SAS

I am trying to do a simple thing: write a PROC REPORT step inside a DATA step. My main idea is that if the condition in the data step is true, PROC REPORT is executed, and if it is false, it is not. Any ideas? The code runs without errors for now, but the condition in the IF statement is not applied, and PROC REPORT is executed even when the condition is not fulfilled.
Thank you in Advance.
%let DATO = 13062016;
PROC IMPORT OUT= WORK.auto1 DATAFILE= "C:\Users\BC1554\Desktop\andel.xlsx"
DBMS=xlsx REPLACE;
SHEET="sheet1";
GETNAMES=YES;
RUN;
data want;
set WORK.auto1;
rownum=_n_;
run;
DATA tbl2;
SET want;
if (rownum => 1 and rownum <=6 ) then output work.tbl2 ;
RUN;
ODS NORESULTS;
ods LISTING close;
ODS RTF FILE="C:\Users\BC1554\Desktop\Statistik_andel_&DATO..rtf";
title "Statistics from monthly run of DK shares of housing companies (andelsboliger)";
data Tbl21 ;
set work.Tbl2;
where (DKANDEL='Daekning_pct_24052016' or DKANDEL='Daekning_pct_18042016') ;
difference = dif(Andel);
difference1 = dif(Total);
run;
data Tbl211 ;
set work.Tbl21;
where (DKANDEL='Daekning_pct_18042016') ;
run;
data Tbl2111 ;
set work.Tbl211;
where (DKANDEL='Daekning_pct_18042016') ;
if abs(difference) > 10 and abs (difference1) > 107 then ;
run;
proc report data= work.Tbl2 spanrows;
columns DKANDEL Andel Total Ukendt ;
title2 "-";
title3 "We REPORT numbers on p.4-5".;
title4 "-";
title5 "The models coverage";
title6 "Run date &DATO.";
footnote1 "Assets without currency code not included";
define DKANDEL / order;
define Andel / order;
define Total / order;
define Ukendt / order;
define DKANDEL/ display;
define Andel / display;
Compute DKANDEL;
call define (_col_,"style","style={background=orange}");
endcomp;
Compute Andel;
call define (_col_,"style","style={background=red}");
endcomp;
run; title; footnote1;
ODS RTF close;
ODS LISTING;
title;
run;
To conditionally execute code you need to use a macro so that you can use macro logic like %IF to conditionally generate the code.
But for your simple problem you can use a macro variable to modify the RUN; statement on your PROC REPORT step. Create a macro variable and set it to the value CANCEL when you don't want the step to run.
%let cancel=CANCEL;
...
if abs(difference) > 10 and abs (difference1) > 107 then call symputx('cancel','');
...
proc report ... ;
...
run &cancel ;
Simple example. Produce report if anyone is aged 13.
%let cancel=CANCEL;
data _null_;
set sashelp.class ;
if age=13 then call symputx('cancel',' ');
run;
proc report data=sashelp.class ;
run &cancel;
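For completeness, the %IF approach mentioned at the start could look roughly like this. It is a sketch only; the macro name report_if and the flag run_it are made up for illustration.

```sas
%macro report_if(run_it=0);
  %if &run_it = 1 %then %do;
    proc report data=sashelp.class;
    run;
  %end;
  %else %put NOTE: Report skipped.;
%mend report_if;

/* Default to "do not run", then switch the flag on if anyone is aged 13 */
%let run_it = 0;
data _null_;
  set sashelp.class;
  if age = 13 then call symputx('run_it', 1);
run;

%report_if(run_it=&run_it);
```

The data step sets the flag first; the macro then decides whether the PROC REPORT code is generated at all.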
Tom's answer is a good one, and probably what I'd do. But, an alternative that is more exactly what you suggested in the question seems also appropriate.
The way you execute a PROC REPORT in a data step (or execute any non-data-step code in a data step) is with call execute. You can use call execute to execute a macro, or just a string of code; up to you how you want to handle it. I would make it a macro, because that makes development much easier (you can write the macro just like regular code, and you can test it independently).
Here's a simple example that is analogous to what Tom put in his answer.
%macro print_report(data=);
proc report data=&data.;
run;
%mend print_report;
data _null_;
set sashelp.class ;
if age=13 then do;
call execute('%print_report(data=sashelp.class)');
stop; *prevent it from doing this more than once;
end;
run;
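One thing to watch for with CALL EXECUTE: a macro call pushed this way is run by the macro processor immediately, while the steps it generates only run after the data step ends. That does not matter for a macro as simple as print_report, but it can bite when a macro contains macro logic that depends on values its own generated steps create. A common defensive idiom, shown here as a sketch, is to wrap the call in %NRSTR so the macro is only expanded once the data step has finished:

```sas
data _null_;
  set sashelp.class;
  if age = 13 then do;
    /* %nrstr delays macro expansion until the pushed code is executed */
    call execute('%nrstr(%print_report(data=sashelp.class))');
    stop;  /* submit the report only once */
  end;
run;
```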

How to assign the result from %macro to a macro variable

I have a data set with probabilities of purchasing a particular product per observation. Here is an example:
DATA probabilities;
INPUT id P_prod1 P_prod2 P_prod3 ;
DATALINES;
1 0.02 0.5 0.32
2 0.6 0.08 0.12
3 0.8 0.34 0.001
;
I need to calculate the median for each product. Here's how I do that:
%macro get_median (product);
proc means data=probabilities median;
var &product ;
output out=median_data (drop=_type_ _freq_) median=median;
run;
%mend;
At this point I can get the median for each product by calling
%get_median(P_prod1);
Now, the last thing that I want to do is to assign the numeric result for the median to a macro variable. My best guess for how to do that would be something like:
%let med_P_prod1=%get_median(P_prod1);
but unfortunately that does not work.
Can someone help, please?
Cheers!
The simplest solution is to declare a %global macro variable and assign the numeric result to it inside the macro.
%macro get_median (product);
proc means data=probabilities median;
var &product ;
output out=median_data (drop=_type_ _freq_) median=median;
run;
%global macroresult;
proc sql;
select median into :macroresult separated by ' ' from median_data;
quit;
%mend;
(The SELECT ... INTO clause works like %LET in that it defines a macro variable, but it is better suited to pulling results out of data.)
I'd also recommend just using the dataset in your code rather than putting the value in a macro variable.
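A short usage sketch, assuming the %get_median macro above has been compiled and the probabilities dataset exists. The original %let med_P_prod1=%get_median(P_prod1); cannot work because the macro generates whole PROC steps rather than returning a value, so the call and the assignment have to be separate statements:

```sas
/* Run the steps first; they populate the global macro variable macroresult */
%get_median(P_prod1);

/* Then copy the result into the macro variable you actually want */
%let med_P_prod1 = &macroresult;
%put NOTE: median of P_prod1 is &med_P_prod1;
```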

SAS - How to get last 'n' observations from a dataset?

How can you create a SAS data set from another dataset using only the last n observations of the original dataset? This is easy when you know the value of n. If I don't know n, how can this be done?
This assumes you have a macro variable that says how many observations you want. NOBS tells you the number of observations in the dataset currently without reading the whole thing.
%let obswant=5;
data want;
set sashelp.class nobs=obscount;
if _n_ gt (obscount-&obswant.);
run;
Using Joe's example of a macro variable to specify the number of observations you want, here is another answer:
%let obswant = 10;
data want;
do _i_=nobs-(&obswant-1) to nobs;
set have point=_i_ nobs=nobs;
output;
end;
stop; /* Needed to stop data step */
run;
This should perform better since it only reads the specific observations you want.
If the dataset is large, you might not want to read the whole dataset. Instead, you could first read the total number of observations in the dataset and then start reading at the right position. For example, to keep the last &number observations:
data t;
input x;
datalines;
1
2
3
4
;
%let dsid=%sysfunc(open(t));
%let num=%sysfunc(attrn(&dsid,nlobs));
%let rc=%sysfunc(close(&dsid));
%let number = 2;
data tt;
set t (firstobs = %eval(&num.-&number.+1));
run;
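One edge case with this approach: if you ask for more observations than the dataset holds, the computed FIRSTOBS value drops below 1. A hedged sketch of a guard, using sashelp.class for illustration; the macro variable name first is an assumption, not part of the original answer:

```sas
%let dsid   = %sysfunc(open(sashelp.class));
%let num    = %sysfunc(attrn(&dsid, nlobs));
%let rc     = %sysfunc(close(&dsid));
%let number = 50;  /* deliberately more than the dataset contains */

/* Never let FIRSTOBS fall below 1 */
%let first = %sysfunc(max(1, %eval(&num - &number + 1)));

data tt;
  set sashelp.class (firstobs=&first);
run;
```

With the guard in place, asking for too many observations simply returns the whole dataset.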
For the sake of variety, here's another approach (not necessarily a better one)
%let obswant=5;
proc sql noprint;
select nlobs-&obswant.+1 into :obscalc
from dictionary.tables
where libname='SASHELP' and upcase(memname)='CLASS';
quit;
data want;
set sashelp.class (firstobs=&obscalc.);
run;
You can achieve this using the _nobs_ and _N_ variables. First, create a temporary variable (_nobs_) to store the total number of observations. Then compare the automatic variable _N_ to _nobs_.
data a;
set sashelp.class nobs=_nobs_;
if _N_ gt _nobs_ -5;
run;