I have a data set with multiple attributes, and each attribute has 10-15 rows in the master table. I wish to use a do loop on the data set that would let me extract outputs for each attribute separately. My concern is how to automate the selection of the next attribute in the do loop once the previous attribute's output has been extracted.
Thanks in advance.
I'm not completely sure what you're asking to do, but I can hopefully show the basic ideas of a do loop.
%macro YOUR_MACRO();
%let YOUR_VARIABLE = 1 2 3 ...; /*This could be whatever you want to split up from your master table*/
%let NUM_VAR = 3; /*Change this to the number of YOUR_VARIABLEs listed*/
%do i = 1 %to &NUM_VAR. %by 1;
%let LOOP_VAR = %scan(&YOUR_VARIABLE., &i.);
/*This do i = 1 starts your loop at 1 and goes up by 1 until your NUM_VAR is reached*/
proc sql;
create table TABLE_&LOOP_VAR. as /*Creates a specific table for each variable*/
select *
from MASTER_TABLE
where COLUMN_NAME = &LOOP_VAR. /*Splits up your table by a certain attribute equaling the loop variable*/
;
quit;
%end;
%mend;
%YOUR_MACRO(); /*Runs your loop*/
This is the basic structure and should give a little help. You can also just scan your master table for each variable name then separate it by that without having to type each one out.
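The last point, scanning the master table instead of typing each value out, could look something like the sketch below. It reuses the placeholder names MASTER_TABLE and COLUMN_NAME from the code above; the macro name split_by_attribute and the macro variable value_list are made up for illustration, and the values are assumed to be numeric and valid as table-name suffixes.

```sas
/* Sketch only: MASTER_TABLE and COLUMN_NAME are the placeholders used above. */
/* split_by_attribute and value_list are hypothetical names.                  */
%macro split_by_attribute();
    %local i n_values value_list loop_var;
    %let value_list = ;

    /* Collect the distinct attribute values into one space-separated list */
    proc sql noprint;
        select distinct COLUMN_NAME
            into :value_list separated by ' '
            from MASTER_TABLE;
    quit;

    /* Assumes MASTER_TABLE has at least one row, so the list is not empty */
    %let n_values = %sysfunc(countw(&value_list., %str( )));

    /* One output table per value, same idea as the hand-typed list */
    %do i = 1 %to &n_values.;
        %let loop_var = %scan(&value_list., &i., %str( ));
        data TABLE_&loop_var.;
            set MASTER_TABLE;
            where COLUMN_NAME = &loop_var.;
        run;
    %end;
%mend split_by_attribute;

%split_by_attribute();
```

If the attribute is a character column, the WHERE clause would need quotes around &loop_var., and values containing spaces or other characters that are not valid in a table name would need extra handling.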
I have a SAS macro that, when given different arguments, creates several tables. It looks something like this:
%macro create_tables(key, value);
data WORK.TABLE_&key.;
set WORK.MAIN_TABLE;
where col = &value.;
col_&key. = 1;
drop col;
run;
%mend create_tables;
The key parameter/macro variable is injected into the table name. I call this macro several times with different keys and values.
I want to convert this piece of code into Teradata syntax. I can create multiple tables for every key and value, but I have 30+ keys and values. What would be the best way to achieve this in Teradata? Would creating multiple tables be more efficient? The number of rows for each table created will be between 1 million and 2 million, and the MAIN_TABLE has 30+ million rows.
I agree with Tom that it is not obvious why you create a lot of new tables that just contain a certain part of your original data.
So, if you want to have these extra columns (one column per value of column col) then imho the following code seems to be the best solution in SAS.
Here, a view is created by a SAS macro where you can specify one or more values via the macro parameter KEY. For each value there will be a new column with the values 1 and 0 in the resulting view.
If you want only the rows with one certain value then you can do that based on this view.
%macro create_colkeys (KEY=);
%let NUMWORDS = %sysfunc(countw(&KEY.));
%do i=1 %to &NUMWORDS.;
%let KEY_&i.=%scan(&KEY,&i.);
%end;
proc sql;
create view col_flags as
select
%do i=1 %to &NUMWORDS.;
case when col="&&KEY_&i." then 1
else 0
end as col_&&KEY_&i..,
%end;
*
from main_table;
quit;
%mend;
%create_colkeys(KEY = abc def xyz);
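To keep only the rows with one particular value, as mentioned above, the view can be filtered on the matching flag column. A minimal sketch, assuming the macro has been run with the call above so that the view col_flags contains a column col_abc; the output table name only_abc is made up for illustration.

```sas
/* Sketch only: col_flags and col_abc come from the macro call above; */
/* only_abc is a hypothetical output name.                            */
proc sql;
    create table only_abc as
    select *
    from col_flags
    where col_abc = 1;
quit;
```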
I've output 'Moments' from Proc Univariate to datasets. Many.
Example: Moments_001.sas7bdat through to Moments_237.sas7bdat
For the first column of each dataset (new added first column, and probably new dataset, as opposed to the original) I would like to have a particular text in every cell going down to bottom row.
The exact text would be the name of the respective dataset file: say, "Moments_001".
I do not have to 'grab' the filename, per se, if that's not possible. As I know what the names are already, I can put that text into the procedure. However, grabbing the filenames, if possible, would be easier from my standpoint.
I'd greatly appreciate any help anyone could provide to accomplish this.
Thanks,
Nicholas Kormanik
Are you looking for the INDSNAME option of the SET statement? You need to define two variables because the one generated by the option is automatically dropped.
data want;
length moment dsn $41 ;
set Moments_001 - Moments_237 indsname=dsn ;
moment=dsn;
run;
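The value returned by INDSNAME includes the libref (for example WORK.MOMENTS_001, in upper case). If only the data set name is wanted in the first column, a variation along these lines might do; source_name is a made-up variable name, and the LENGTH statement before SET is what puts it first.

```sas
/* Sketch only: source_name is a hypothetical variable name. */
data want;
    length source_name $32;                      /* declared first, so it becomes column 1 */
    set Moments_001 - Moments_237 indsname=dsn;  /* dsn is temporary and not written out   */
    source_name = scan(dsn, 2, '.');             /* text after the libref                  */
run;
```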
I think something along these lines should be what you're after. Assuming you have a list of moments, you can loop through it and add a new variable as the first column of each dataset.
%let list_of_moments = moments_001 moments_002 ... moments_237;
%macro your_macro;
%do i = 1 %to %sysfunc(countw(&list_of_moments.));
%let this_moment = %scan(&list_of_moments., &i.);
data &this_moment._v2;
retain new_variable;
set &this_moment.;
new_variable = "&this_moment.";
run;
%end;
%mend your_macro;
%your_macro;
The brute force entering of text into column 1 looks like this:
data moments_001;
length text $ 16;
set moments_001;
text="Moments_001";
run;
You could also write a macro that would loop through all 237 data sets and insert the text.
UNTESTED CODE
%macro do_all;
%do i=1 %to 237;
%let num = %sysfunc(putn(&i,z3.));
data moments_&num;
length text $ 16;
set moments_&num;
text="Moments_&num";
run;
%end;
%mend;
%do_all
It seems to me (not knowing your problem) that if you use PROC UNIVARIATE with the BY option, then you wouldn't need 237 different data sets, all of your output would be in one data set and the BY variable would also be in the data set. Does that solve your problem?
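A rough sketch of that BY-group approach is shown below. The input data set have, the grouping variable group_id and the analysis variable x are all assumptions, since the original data is not shown; Moments is the ODS table that PROC UNIVARIATE produces for the moments output.

```sas
/* Sketch only: have, group_id and x are hypothetical names. */
proc sort data=have out=have_sorted;
    by group_id;
run;

/* Route the Moments table for every BY group into one data set */
ods output Moments=all_moments;
proc univariate data=have_sorted;
    by group_id;
    var x;
run;
```

The resulting all_moments data set carries group_id on every row, so there is no need to stamp a data set name onto 237 separate files.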
I have a set of input macro variables in SAS. They are dynamic and generated based on the user selection in a sas stored process.
For example: there are 10 input values, 1 to 10.
The name of the macro variable is VAR_. If a user selects 2,5,7 then 4 macro variables are created.
&VAR_0=3;
&VAR_=2;
&VAR_1=5;
&VAR_2=7;
The first one with suffix 0 provides the count. The next 3 provides the values.
Note: If a user selects only one value, then only one macro variable is created. For example, if a user selects 9, then &var_=9; will be created. There will not be any count macro variable.
I am trying to create a sas table using these variables.
It should be like this
OBS VAR
-----------
1 2
2 5
3 7
-----------
This is what I tried. I'm not sure if this is the right way to approach it.
It doesn't give me a final solution, but I can at least get the names of the macro variables in a table. How can I get their values?
data tbl1;
do I=1 to &var_0;
VAR=CAT('&VAR_',I-1);
OUTPUT;
END;
RUN;
PROC SQL;
CREATE TABLE TBL2 AS
SELECT I,
CASE WHEN VAR= '&VAR_0' THEN '&VAR_' ELSE VAR END AS VAR
from TBL1;
QUIT;
Thank You for your help.
Jay
SAS helpfully stores them in a table for you already, you just need to parse out the ones you want. The table is called SASHELP.VMACRO or DICTIONARY.MACROS
Here's an example:
%let var=1;
%let var2=3;
%let var4=5;
proc sql;
create table want as
select * from sashelp.vmacro
where name like 'VAR%';
quit;
proc print data=want;
run;
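Applied to the macro variables from the question, the same idea might look like the sketch below. It assumes the stored process variables are named VAR_, VAR_0, VAR_1 and so on, that their values are numeric, and that there are few enough of them for an alphabetical sort on the name to give the intended order.

```sas
/* Sketch only: assumes the VAR_ naming from the question and numeric values. */
proc sql;
    create table want as
    select name,
           input(value, best32.) as var
    from dictionary.macros
    where name like 'VAR^_%' escape '^'   /* escape the underscore wildcard */
      and name ne 'VAR_0'                 /* skip the count variable        */
    order by name;
quit;
```

Because VAR_0 is excluded rather than relied upon, this also covers the single-selection case where only VAR_ exists.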
I think the real issue is the inconsistent behavior of the stored process. It only creates the 0 and 1 variables when there are multiple selections. I think that your example is a little off. If the value of VAR_0 is three, then there should be a VAR_3 macro variable. Also, the values of VAR_ and VAR_1 should be set to the same thing.
To fix this in the past I have done something like this. First let's assign the parameter name a macro variable so that the code is reusable for other programs.
%let name=VAR_;
Then first make sure the minimal macro variables exist.
%global &name &name.0 &name.1 ;
Then make sure that you have a count by setting the 0 variable to 1 when it is empty.
%let &name.0 = %scan(&&&name.0 1,1);
Then make sure that you have a 1 variable. Since it should have the same value as the macro variable without a suffix just re-assign it.
%let &name.1 = &&&name ;
Now your data step is easier.
data want ;
length var $32 value $200 ;
do i=1 to &&&name.0 ;
var=cats(symget('name'),i);
value=symget(var);
output;
end;
run;
I don't understand your numbering scheme and recommend changing it, if you can; the &var_ variable is very confusing.
Anyway, the easiest way to do this is SYMGET. That returns a value from the macro symbol table which you can specify at runtime.
%let VAR_0=3;
%let VAR_=2;
%let VAR_1=5;
%let VAR_2=7;
data want;
do obs = 1 to &var_0.;
var = input(symget(cats('VAR_',ifc(obs=1,'',put(obs-1,2.)))),2.);
output;
end;
run;
I am really struggling with this, guys.
The table that needs to be updated has ~15M rows and ~200 columns.
I need to update a few columns using a work table.
This is (partly) what I need to do:
%macro condition;
%if &row_count>0 %then %do;
data _null_;
set W4TWGKJ6 end=final;
if _n_ = 1 then call execute("proc sql ;");
call execute
("update dds.insurance_policy set X_STORNO_BY_VERSION="||TOSNUM||" where policy_no='"||cats(polid)||"' and X_INSURANCE_PRODUCT_CD='"||cats(prodid)||"'
and X_INSURER_SERIAL_NO = "||X_INSURER_SERIAL_NO||" and x_source_system_cd ="||'"5"'||" and x_source_system_category_cd ="||'"5"'||" and x_current_ind = "||'"Y"'||";,
update dds.insurance_policy set STATUS_CHANGE_DT="||ISSUE_DT||" where policy_no='"||cats(polid)||"' and X_INSURANCE_PRODUCT_CD='"||cats(prodid)||"'
and X_INSURER_SERIAL_NO = "||X_INSURER_SERIAL_NO||" and x_source_system_cd ="||'"5"'||" and x_source_system_category_cd ="||'"5"'||" and x_current_ind = "||'"Y"'||";");
if final then call execute('quit;'); run;
%end;
%mend;
%condition;
I first check if there are rows in table (&row_count)
if there are,
I update 2 columns (I need to update 5, I just cut them from the example)
using a work table called W4TWGKJ6.
This update takes forever.
In fact, I stopped the process every single time, as it worked for hours without returning anything....
Does anyone know a better solution for this problem?
Thanks in advance,
Gal.
I'd suggest using the MODIFY statement in a DATA step.
The BY variables should have the same names in both tables, and both tables should be sorted by those variables.
data dds.insurance_policy;
modify
dds.insurance_policy
W4TWGKJ6 (keep= POLICY_NO X_INSURER_SERIAL_NO /* key variables */
X_STORNO_BY_VERSION STATUS_CHANGE_DT /* ... other variables from source to update target */
)
updatemode=nomissingcheck;
by POLICY_NO X_INSURER_SERIAL_NO;
if _iorc_ = %sysrc(_SOK) then do;
* Update row ;
replace;
end;
else _error_ = 0;
run;
See the question 'SAS: How not to overwrite a dataset when the "where" condition in a "Modify" statement does not hold?' for a complete reference of the _IORC_ return values.
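The requirement that the BY variables share names and sort order could be met with a preparation step like the sketch below. It assumes, based on the UPDATE statements in the question, that polid corresponds to POLICY_NO, TOSNUM to X_STORNO_BY_VERSION and ISSUE_DT to STATUS_CHANGE_DT, and that the types already match the target columns; work.trans is a made-up name.

```sas
/* Sketch only: the rename pairs are inferred from the question's UPDATE */
/* statements, and work.trans is a hypothetical output name.             */
proc sort data=W4TWGKJ6(rename=(polid=POLICY_NO
                                TOSNUM=X_STORNO_BY_VERSION
                                ISSUE_DT=STATUS_CHANGE_DT))
          out=work.trans;
    by POLICY_NO X_INSURER_SERIAL_NO;
run;
```

The MODIFY step would then read work.trans in place of W4TWGKJ6.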
I have a dataset called have with one entry with multiple variables that look like this:
message reference time qty price
x 101 35000 100 .
The above dataset changes on every pass through a loop, and message can also be "A". If message="X", this means removing 100 qty from the MASTER set on the row whose reference number matches. The price is missing (price=.) because it is already in the MASTER database under reference=101. The MASTER database aggregates all the available orders at some price with the quantity available. If in the next pass message="A", then the have dataset would look like this:
message reference time qty price
A 102 35010 150 500
This means adding a new reference number to the MASTER database, in other words, appending the line to the MASTER.
I have the following code in my loop to update the quantity in my MASTER database when there is a message X:
data b.master;
modify b.master have(where=(message="X")) updatemode=nomissingcheck;
by order_reference_number;
if _iorc_ = %sysrc(_SOK) then do;
replace;
end;
else if _iorc_ = %sysrc(_DSENMR) then do;
output;
_error_ = 0;
end;
else if _iorc_ = %sysrc(_DSEMTR) then do;
_error_ = 0;
end;
else if _iorc_ = %sysrc(_DSENOM) then do;
_error_ = 0;
end;
run;
I use the replace to update the quantity. But since price=. in my entry when message is X, the above code sets price=. where reference=101 in the MASTER via the replace statement, which I don't want. Hence, I would prefer to delete the price column if message=X in the have dataset. But I don't want to delete the price column when message=A, since I use this code:
proc append base=MASTER data=have(where=(msg_type="A")) force;
run;
Hence, I have this code prior to my MODIFY statement:
data have(drop=price_alt);
set have;
if message="X" then do;
output;
end;
else do; /*I WANT TO MAKE NO CHANGE*/
end;
run;
but it doesn't do what I want. If the message is not equal to X, then I don't want to drop the column. If it is equal to X, I want to drop the column. How can I adapt the code above to make it work?
It's a bit of a strange request, to be honest, such that it raises questions about whether what you're doing is the best way of doing it. However, in the spirit of answering the question...
The answer by DomPazz gives the option of splitting the data into two possible sets, but if you want code down the line to always refer to a specific data set, this creates its own complications.
You also can't, in a single data step, tell SAS to output to the "same" data set where one instance has a column and one instance doesn't. So what you'd like, therefore, is for the code itself to be dynamic, so that the data step that runs is either one that drops the column or one that does not, depending on whether message=x. The answer to this need for dynamic code, like many things in SAS, comes down to the creative use of macros. It looks something like this:
/* Just making your input data set */
data have;
message='x';
time=35000;
qty=1000;
price=10.05;
price_alt=10.6;
run;
/* Writing the macro */
%macro solution;
%local id rc1 rc2;
%let id=%sysfunc(open(work.have));
%syscall set(id);
%let rc1=%sysfunc(fetchobs(&id, 1));
%let rc2=%sysfunc(close(&id));
%IF &message=x %THEN %DO;
data have(drop=price_alt);
set have;
run;
%END;
%ELSE %DO;
data have;
set have;
run;
%END;
%mend solution;
/* Running the macro */
%solution;
Try this:
data outX(drop=price_alt) outNoX;
set have;
if message = "X" then
output outX;
else
output outNoX;
run;
As @sasfrog says in the comments, a table either has a column or it does not. If you want to subset things where MESSAGE="X", then you can use something like this to create 2 data sets.