here is my table
data:
ax bx cx dx ex fx
1 2 3 4 5 5
2 3 5 1 0 5
3 7 8 9 1 4
here is my basic code
%macro example(c= , b= ,a= );
data temp;
set data;
diff = &c-(&b+&a);
run;
%mend example;
%example(c=cx ,b=bx ,a=ax)
I want to automate diff = c-(b+a) by setting up a prompt-like feature in SAS EG, but I do not know how to do it. My aim is to be able to change the variables in the "diff" equation (for example, fx instead of cx, or ex instead of ax, and so on), because my actual data consists of thousands of columns.
I would appreciate any help.
To automate this, you'd probably want to make three prompts. One for each variable (c,b,a). (Of course, call them something descriptive, not c,b,a!) Select "use throughout project" and "requires non blank value". Maybe add some more useful text to describe what they are.
Then, you need to have a way of populating them. You can populate them from a static list (enter the possible values in), leave them as open text boxes where you type the values in yourself each time, or populate them from a data source. The mechanics of populating from a data source depend on your local setup (are you using "local EG", or is EG connected to a metadata server, for example), but overall it should be fairly straightforward.
Under either "User selects values from a static list" or "User selects values from a dynamic list", select "Get values", then "Browse" for the SAS data file. The dynamic list always checks the data source for updates, while the static list is populated only once, at prompt creation time.
Finally, in your program, your macro call would then look like:
%example(c=&c ,b=&b ,a=&a)
where &c &b &a are the prompt names (the 'short' name if you gave it a longer text name also).
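If you populate the prompts from a data source, one option is to build a small table of column names first and browse to that. A minimal sketch, assuming the table is WORK.DATA as in the question; the output name prompt_values is hypothetical:

```sas
/* List the columns of WORK.DATA, one row per variable. */
proc contents data=work.data
              out=work.prompt_values(keep=name varnum)
              noprint;
run;

/* Put the names in column order so the prompt list is easy to scan. */
proc sort data=work.prompt_values;
  by varnum;
run;
```

You would then point "Browse" at this table and pick the NAME column. On a metadata server setup, a dynamic list may need the table to live in a registered library rather than WORK.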
What you want is something like this:
/*Define Prompt*/
%window info
#5 @5 'Var1:'
#5 @13 var1 2 attr=underline
#7 @5 'Var2:'
#7 @13 Var2 2 attr=underline
#9 @5 'Var3:'
#9 @13 Var3 2 attr=underline;
/*Show Prompt*/
%display info;
/*Display Macro Variables in the Log*/
%put &var1;
%put &var2;
%put &var3;
%example(c=&var1 ,b=&var2 ,a=&var3)
I have the following code where users will be presented with the following window and they are to enter a text
Code:
%let study_code=;
%macro startme ;
%global study_code;
%window first
#3 @45 'Electronic Filing System' color=blue ////
@20 'Study code:' color=black +2 study_code 30 color=green required=yes attr=underline //
@10 '**************** Hit ENTER to begin ******************' color=green
;
%display first ;
%let study_code_new=%sysfunc(strip(%nrbquote(&study_code)));
%put &study_code_new.;
%mend;
%startme;
Window presented when run:
I type 123, hit Enter and it outputs 123 in the logs as expected:
However, if a user enters 123" by accident in the field, I am presented with the single quote error:
ERROR: Literal contains unmatched quote.
ERROR: The macro STARTME will stop executing.
How do I prevent SAS from reading " as code and treat it as literal string? I want to capture it in study_code_new macro variable so that I can tell the user that they have mistyped it.
It is not the %WINDOW command or the %DISPLAY command that is the issue. It is the code that you write that uses the macro variable's value. You need to add macro quoting.
So first immediately add macro quoting to the macro variable populated by the %DISPLAY statement call.
%window first
#3 @45 'Electronic Filing System' color=blue
#7 @20 'Study code:' color=black +2 study_code 30 color=green
required=yes attr=underline
#9 @10 '**************** Hit ENTER to begin ******************' color=green
;
%display first ;
%let study_code=%superq(study_code);
Then make sure to keep the macro quoting on any macro variable you derive from it (at least until you are sure it no longer needs the macro quoting).
46 %window first
47 #3 @45 'Electronic Filing System' color=blue
48 #7 @20 'Study code:' color=black +2 study_code 30 color=green required=yes attr=underline
49 #9 @10 '**************** Hit ENTER to begin ******************' color=green
50 ;
51 %display first ;
52 %let study_code=%superq(study_code);
53 %let study_code_new=%qsysfunc(strip(&study_code));
54 %put &=study_code &=study_code_new;
STUDY_CODE= 123" STUDY_CODE_NEW=123"
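Once the value is macro quoted, you can test it for stray quotation marks and tell the user. A minimal sketch: the macro name check_study_code and the message wording are hypothetical, and the %let only simulates what %DISPLAY would have captured.

```sas
%global study_code;
/* Simulate a user typing 123" into the window field. */
%let study_code = %str(123%");

%macro check_study_code;
  /* %superq returns the value without resolving or tokenizing it,
     so the unmatched quote is treated as plain text. */
  %if %index(%superq(study_code), %str(%")) > 0
      or %index(%superq(study_code), %str(%')) > 0 %then
    %put WARNING: Study code contains a quotation mark. Please re-enter it.;
  %else
    %put NOTE: Study code accepted: %superq(study_code);
%mend check_study_code;

%check_study_code
```

In the real program you would drop the %let and call the check right after %display first.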
I have two datasets in two different SAS tables that also have completely different data structures. I am being asked (not my idea) to export these datasets to one .dat file and essentially stack them on top of each other using a fixed width method. The below listed snippet of data is how the export should ultimately look when it gets to the .dat file. The first row is the result of the first dataset. The second row is result of the second dataset.
UH INCR000000XXXXXXXXXXXXXXXX
XXX SFLXXXXXXXXXXXX 000 M SMITH XXXXXX XXXXXXXXXXXXX9991231
I can't figure out exactly how to do this. Below is the code I've come up with that exports the data but the second data step just overwrites the first.
Here's an example using the MOD option on the FILE statement.
Note this may not work on all OS's.
filename test1 '/home/reeza/Demo1/testfile.dat';
data exportClass;
set sashelp.class;
file test1;
if _n_=1 then do;
put @1 "Name" @20 "Age" @30 "Sex";
end;
put @1 Name @20 Age @30 Sex;
run;
data exportClass;
set sashelp.class;
file test1 mod;
if _n_=1 then do;
put @1 "Name" @20 "Weight" @30 "Height";
end;
put @1 Name @20 Weight @30 Height;
run;
filename test1;
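If MOD is not available on your system, another option is to write both layouts from a single DATA step, so the file is only opened once. This is a sketch under assumptions: SASHELP.CLASS and SASHELP.CARS stand in for your two tables, and the fileref stacked, the column positions and the widths are hypothetical.

```sas
filename stacked temp;   /* replace TEMP with the path of your .dat file */

data _null_;
  /* Concatenate the two tables; IN= flags tell which one each row came from. */
  set sashelp.class(in=in_class keep=Name Age Sex)
      sashelp.cars(in=in_cars keep=Make Model MSRP obs=5);
  file stacked;
  /* Each source gets its own fixed-width layout. */
  if in_class then put @1 Name $8. @20 Age 3. @30 Sex $1.;
  else put @1 Make $13. @20 Model $40. @65 MSRP 8.;
run;

filename stacked;
```

All rows of the first table are written before any row of the second, which gives the stacked layout without a second FILE statement.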
I'm very new to SAS and am trying to learn everything I need for my analytical task. The task I have now is to create a flag for ongoing applications. I think it is easier to illustrate my problem with a table:
[Update 2017.10.27] data sample in code, big thanks to Richard :)
data sample;
input PeopleID ApplicationID Applied_date :yymmdd10. Decision_date :yymmdd10. Ongoing_flag_wanted;
format Applied_date Decision_date yymmdd10.;
datalines;
1 6 2017.10.1 2017.10.1 1
1 5 2017.10.1 2017.10.4 0
1 3 2017.9.28 2017.9.29 1
1 2 2017.9.26 2017.9.26 1
1 1 2017.9.25 2017.9.30 0
2 8 2017.10.7 2017.10.7 1
2 7 2017.10.2 . 0
3 4 2017.9.30 2017.10.3 0
;
run;
In the system, people apply for the service. When a person does that, he gets a PeopleID, which does not change when the person applies again. Each application also gets an ApplicationID, which is unique, and later applications have a larger ApplicationID. What I want is to create an ongoing flag for each application. The purpose is to show whether, by the time this application came in, the same person had an ongoing application (an application which had not received a decision). See some examples from the table above:
Person#2 has two applications #8 and #7, by the time he applied #8, #7 has not been decided, therefore #8 should get ongoing flag.
Person#1 applied multiple times. Applications #3 and #2 have an ongoing application because of App#1. Applications #6 and #5 came in on the same date, but from the application IDs we can tell that #6 came in later than #5, and as #5 had not been decided by then, #6 gets the ongoing flag.
As you might notice, application with a positive ongoing flag always receives decisions on the same date as it came in. That is because applications with ongoing cases are automatically declined. However, I cannot use this as an indicator: there are many other reasons that trigger an automatic decline.
The ongoing_flag is what I want to create in my dataset. I have tried to sort by 1.peopleID, 2.descending applicationID, 3. descending applied_date, so my entire dataset looks like the small example table above. But then I don't know how to make SAS compare within the same variable (peopleID) but different lines (applicationID) and columns (compare Applied_date with Decision_date). I want to compare, for each person, every application's applied_date with all the previous applications' decision_date, such that I can tell by the time this application came in, whether or not there is an ongoing application from previously in the system.
I know I used too many words to explain my problem. For those who read through, thank you for reading! For those who have any idea on what might be a good approach, please leave your comments! Millions of thanks!
Min:
For problems of this type you want to mentally break the data structure into different parts.
BY GROUP
The variables whose unique combination defines the group. There are one or more rows in a group. Let's call them items.
GROUP DETAILS
Variables that are observational in nature. They may be numbers such as temperature, weight or dollars, or, characters or strings that represent some state being tracked. The details (at the state you are working) themselves might be aggregates for a deeper level of detail.
GOAL
Compute additional variables that further elucidate an aspect of the details over the group. For numeric the goal might be statistical such as MIN, MAX, MEAN, MEDIAN, RANGE, etc. Or it might be identificational such as which ID had highest $, or which name was longest, or any other business rule.
Your specific problem is one of determining claim activity on a given date. I think of it as a coverage type of problem because the dates in question cover a range. The BY GROUP is person and an 'Activity' date.
Here is one data-centric approach. The original data is expanded to have one row per date from applied to decided. Then simple BY group processing and the automatic first. variable are used to determine whether an application came in while an earlier one was still undecided.
data have;
input PeopleID ApplicationID Applied_date :yymmdd10. Decision_date :yymmdd10. Ongoing_flag_wanted;
format Applied_date Decision_date yymmdd10.;
datalines;
1 6 2017.10.1 2017.10.1 1
1 5 2017.10.1 2017.10.4 0
1 3 2017.9.28 2017.9.29 1
1 2 2017.9.26 2017.9.26 1
1 1 2017.9.25 2017.9.30 0
2 8 2017.10.7 2017.10.7 1
2 7 2017.10.2 . 0
3 4 2017.9.30 2017.10.3 0
;
run;
data coverage;
do _n_ = 1 by 1 until (last.PeopleID);
set have;
by PeopleID;
if Decision_date > Max_date then Max_date = Decision_date;
end;
put 'NOTE: ' PeopleID= Max_date= yymmdd10.;
do _n_ = 1 to _n_;
set have;
do Activity_date = Applied_date to ifn(missing(Decision_date),Max_date,Decision_date);
if missing(Decision_date) then Decision_date = Max_date;
output;
end;
end;
keep PeopleID ApplicationID Applied_date Decision_date Activity_date;
format Activity_date yymmdd10.;
run;
proc sort data=coverage;
by PeopleID Activity_date ApplicationID ;
run;
data overlap;
set coverage;
by PeopleID Activity_date;
Ongoing_flag = not (first.Activity_date);
if Activity_date = Applied_date then
output;
run;
proc sort data=overlap;
by PeopleID descending ApplicationID ;
run;
Other approaches could involve arrays, hashes, or SQL. SQL is very different from DATA Step code and some consider it to be more clear.
proc sql;
create table want as
select
PeopleID, ApplicationID, Applied_date, Decision_date
, case
when exists (
select * from have as inner
where inner.PeopleID = outer.PeopleID
and inner.ApplicationID < outer.ApplicationID
and
case
when inner.Decision_date is null and outer.Decision_date is null then 1
when inner.Decision_date is null then 1
when outer.Decision_date is null then 0
else outer.Decision_date < inner.Decision_date
end
)
then 1
else 0
end as Ongoing_flag
from have as outer
;
quit;
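The rule as stated in the question (compare each application's Applied_date with the Decision_date of earlier applications) can also be written directly. This is a sketch, not the query above: the aliases cur and prev and the table name want_alt are hypothetical, and it assumes that a decision made on the same day as a new application still counts as ongoing, as the coverage approach does.

```sas
proc sql;
  create table want_alt as
  select cur.PeopleID, cur.ApplicationID, cur.Applied_date, cur.Decision_date
       , case
           when exists (
             select 1 from have as prev
             where prev.PeopleID = cur.PeopleID
               and prev.ApplicationID < cur.ApplicationID
               /* earlier application is undecided, or was decided
                  on or after the date the current one came in */
               and (prev.Decision_date is missing
                    or prev.Decision_date >= cur.Applied_date)
           )
           then 1
           else 0
         end as Ongoing_flag
  from have as cur
  order by cur.PeopleID, cur.ApplicationID desc;
quit;
```

On the eight sample rows this should give the same values as Ongoing_flag_wanted.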
How can I add an input box to a SAS SQL query that asks the user for a parameter (something like an Access input box) in Enterprise Guide?
Here is a solution using Base SAS:
You could use the %WINDOW statement together with %DISPLAY
DATA _NULL_;
%LET BATCH1=;
%WINDOW BATCH_ANALYSIS COLOR = WHITE
ICOLUMN = 30 IROW = 11
COLUMNS = 88 ROWS = 20
#1 @28 "CLIENT BATCH REPORT"
#4 @12 "Date must be entered YYYY-MM-DD Format, ascending order."
#6 @28 "Example = '2015-01-31'"
#9 @5 "Enter Batch Date - [ENTER] when complete:"
#11 @5 BATCH1 12 attr=underline
#13 @5 "Reports will be written to 'location'";
%DISPLAY BATCH_ANALYSIS;
STOP;
RUN;
%put &batch1;
The above is an example of using "user input" to drive your query or data step. In this case, I am prompting the user to enter a date, which stores that string value in a macro variable that can be passed anywhere in your SAS code (I am only using the string date format because it gets passed to an RSUBMIT in a DB2 environment). It may be a good idea to play with the input lines and so on to display the text you want in your prompt window.
Are you using Enterprise Guide?
If that's the case, you can create prompts which will create macro variables when you run your code.
You will just have to use those macro variables in your code.
Right click your program > Properties > Prompts > Prompt Manager and so on.
Have a look at it and see if it solves your problem.
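As a rough sketch of what using such a macro variable looks like in a query: the prompt name min_age is hypothetical, SASHELP.CLASS stands in for your table, and the %let only mimics the value that Enterprise Guide would set from the prompt.

```sas
/* Stand-in for the prompt: EG would create &min_age for you at run time. */
%let min_age = 13;

proc sql;
  select Name, Age
  from sashelp.class
  where Age >= &min_age;   /* the prompt value is substituted as text */
quit;
```

When the program is attached to a real prompt, remove the %let so it does not overwrite the user's choice.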
I am working with survey data where the variable names in our database are descriptive, and not sequentially numbered. They are sequential in the database (moving from left to right). I would like to work in my programs with numbered variables, and I have been unsuccessful in trying to rename them programmatically without having to write out every change by hand (there are 87 total variables).
I have tried to use array, but that has not worked since they are not named sequentially nor do they have a common structure (no common prefix or suffix).
Example data is below:
data svy;
input id relationship outburst checkwork goodideas ;
cards;
101 3 4 5 6
102 4 5 6 6
103 1 1 8 1
104 2 3 2 4
;
run;
***** does not work ;
data svy_1; set svy;
rename relationship--goodideas = var01--var04;
run;
quit;
The above code returns the following error in the log:
ERROR: Missing numeric suffix on a numbered variable list (relationship-goodideas).
I would like to rename the variables to something like: var01, var02, etc...
Any help is greatly appreciated.
A few things:
Your data step #2 isn't right - it doesn't have a set statement. Also, it doesn't require 'quit' - quit is only for certain PROCs that generally are 'programming environments', such as PROC SQL, PROC FORMAT, PROC DATASETS. It doesn't do any harm but it looks odd :)
Sequential-in-the-dataset variable lists are double dash. So, you could trivially create an array with these:
array myvars relationship--goodideas;
So if that's good enough for you (no rename), then, go for it. If you really want to rename them (a bit of a bad idea IMO since it takes away some meaning of the variable name, making code harder to read, though I understand the reasoning why you'd want to), you can't use this unfortunately - while it's correct, the RENAME statement does not support it.
82 ***** does not work ;
83 data svy_1;
84 rename relationship--goodideas = var01-var04;
------------
47
ERROR 47-185: Given form of variable list is not supported by RENAME. Statement is ignored.
85 run;
You cannot use an array to perform rename statements, unfortunately; so you'll have to do something else. Here's one answer.
proc contents data=svy out=svy_vars(keep=name varnum) noprint;
run;
proc sort data=svy_vars;
by varnum;
run;
data for_rename;
set svy_vars;
if name in ('relationship' 'outburst' 'checkwork' 'goodideas') then do;
namectr+1;
new_name=cats(name,'=','var',put(namectr,z2.));
output;
end;
run;
proc sql;
select new_name into :renlist separated by ' ' from for_rename;
quit;
proc datasets nolist;
modify svy;
rename &renlist;
quit;
You can do something similar in a shorter fashion using PROC SQL and the DICTIONARY.COLUMNS table, or a data step and SASHELP.VCOLUMN, but the proc contents method is somewhat more transparent as to what's happening. If you have more than four variables, you may want to change that IN statement into a negative statement (if name not in (list of things to not change)) if that's easier, or even use the VARNUM variable itself to determine which variables you want to change (if varnum in (2:5) would work there).
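The DICTIONARY.COLUMNS variant mentioned above might look like the following. It is only a sketch: it assumes svy is in WORK and that the variables to rename sit in positions 2 through 5, as in the example data.

```sas
proc sql noprint;
  /* Build pairs such as relationship=var01 in column order. */
  select catx('=', name, cats('var', put(varnum - 1, z2.)))
    into :renlist separated by ' '
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'SVY'          /* library and member names are stored in upper case */
    and varnum between 2 and 5
  order by varnum;
quit;

proc datasets lib=work nolist;
  modify svy;
  rename &renlist;
quit;
```

For all 87 variables you would only change the varnum range, since the new number is derived from the column position.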
A colleague came up with the best approach:
***** does work ;
data svy_1;
set svy;
array old { 4 } relationship--goodideas;
array var { 4 } ;
do i = 1 to 4;
var[i] = old[i];
end;
drop i;
run;