SAS: Create a dataset with two column values

I have a macro variable &OldCount in SAS Enterprise Guide, and I am trying to create a new dataset called Count1 with one column called VarName and a second column called ColumnCount. I want to create a row where VarName = "OldCountTotal" and ColumnCount = the value of &OldCount (let's say 56).
How do I do this, linking the ColumnCount value to the macro variable?
Thanks in advance for helping me.

If I understand you correctly, you would do the following (generating just a single-row dataset):
data Count1;
VarName = 'OldCountTotal';
ColumnCount = &OldCount.;
output;
run;
I assume your variable is defined earlier in the program, something like this:
%let OldCount = 56;
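If the count comes from data rather than a hard-coded %let, a minimal sketch might look like the following. It assumes the count is the number of rows in some table; sashelp.class is only a stand-in here, and the explicit length for VarName is an addition for illustration.

```sas
/* Assumption: the count is a row count; sashelp.class is a stand-in table */
proc sql noprint;
  select count(*) into :OldCount from sashelp.class;
quit;

data Count1;
  length VarName $32;          /* explicit length so longer names would not be truncated */
  VarName = 'OldCountTotal';
  ColumnCount = &OldCount.;    /* the macro variable resolves to a numeric literal */
run;
```

Because &OldCount. is resolved before the DATA step compiles, ColumnCount is created as an ordinary numeric variable.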

SAS macro to Teradata syntax

I have a SAS macro that, when given different arguments, creates several tables. It looks something like this:
%macro create_tables(key, value);
data WORK.TABLE_&key.;
set WORK.MAIN_TABLE;
where col = &value.;
col_&key. = 1;
drop col;
run;
%mend create_tables;
The key parameter/macro variable is injected into the table name. I call this macro several times with different keys and values.
I want to convert this piece of code into Teradata syntax. I can create multiple tables for every key and value, but I have 30+ keys and values. What would be the best way to achieve this in Teradata? Would creating multiple tables be more efficient? The number of rows for each table created will be between 1 million and 2 million, and the MAIN_TABLE has 30+ million rows.
I agree with Tom that it is not obvious why you create a lot of new tables that just contain a certain part of your original data.
So, if you want to have these extra columns (one column per value of column col) then imho the following code seems to be the best solution in SAS.
Here, a view is created by a SAS macro where you can specify one or more values via the macro parameter KEY. For each value there will be a new column with the values 1 and 0 in the resulting view.
If you want only the rows with one certain value then you can do that based on this view.
%macro create_colkeys (KEY=);
%let NUMWORDS = %sysfunc(countw(&KEY.));
%do i=1 %to &NUMWORDS.;
%let KEY_&i.=%scan(&KEY,&i.);
%end;
proc sql;
create view col_flags as
select
%do i=1 %to &NUMWORDS.;
case when col="&&KEY_&i." then 1
else 0
end as col_&&KEY_&i..,
%end;
*
from main_table;
quit;
%mend;
%create_colkeys(KEY = abc def xyz);

Create a column in SAS using a macro variable

I am trying to create a column using the string value of a macro variable in SAS.
I have a dataset called want7 which has a column called 'ID'. I want to create a new dataset called want8 with a new column called 'ID1' by dynamically linking it to &string1 (as in the name of the column is linked to &string1) but the values of the column should equal the value of the 'ID' column in want7. How do I do this? Thanks in advance. I have only copied and pasted what I could write since I am relatively new to SAS.
%let string1 = ID1;
data want8; set want7;
/* Something like &string1 = ID */
run;
Using sashelp.class as an example (because it exists by default). Substitute as needed:
%let string1 = ID1;
data want8;
set sashelp.class;
&string1 = age ;
run;
This will reread the dataset. If you just want renames, look at the dataset option rename=. See SAS documentation: https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000695119.htm
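A minimal sketch of the rename= approach, again using sashelp.class as a stand-in for want7 and age as a stand-in for ID. Note that this renames the column rather than copying it, so the original name is no longer present in the output.

```sas
%let string1 = ID1;

/* sashelp.class stands in for want7; age stands in for ID */
data want8;
  set sashelp.class(rename=(age=&string1));
run;
```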

I want to add auto_increment column in a table in SAS

I want to add an auto-increment column to a table in SAS. The following code adds a column but does not increment the value.
Thanks in advance.
proc sql;
alter table pmt.W_cur_qtr_recoveries
add ID integer;
quit;
Wow, going to try for my second "SAS doesn't do that" answer this morning. Risky stuff.
A SAS dataset cannot define an auto-increment column. Whether you are creating a new dataset or inserting records into an existing dataset, you are responsible for creating any increment counters (ie they are just normal numeric vars where you have set the values to what you want).
That said, there are DATA step statements such as the sum statement (e.g. MyCounter+1) that make it easier to implement counters. If you describe more details of your problem, people could provide some alternatives.
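As a rough illustration of the sum statement, the sketch below numbers every row and also keeps a counter that restarts within each BY group. The dataset and variable names (class_sorted, RowId, SeqInGroup) are made up for the example.

```sas
/* sashelp.class is used only as sample data */
proc sort data=sashelp.class out=work.class_sorted;
  by sex;
run;

data work.class_numbered;
  set work.class_sorted;
  by sex;
  RowId + 1;                          /* sum statement: retained, starts at 0, adds 1 per row */
  if first.sex then SeqInGroup = 0;   /* reset at the start of each BY group */
  SeqInGroup + 1;                     /* counter within the current group */
run;
```

A variable used in a sum statement is retained automatically, which is why no RETAIN statement is needed.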
The correct answer at this time is to create the ID yourself, BUT the discussion wouldn't be complete without mentioning that there is an unsupported SQL function Monotonic that can do what you want. It's not reliable, yet it persists.
The code pattern for its usage is
select monotonic() as ID, ....
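A fuller sketch of that pattern, with sashelp.class as sample data. Since monotonic() is undocumented, treat the numbering as unverified, especially in queries with joins, ORDER BY, or GROUP BY.

```sas
proc sql;
  create table work.class_with_id as
  select monotonic() as ID, *    /* undocumented function: numbers rows as they are read */
  from sashelp.class;
quit;
```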
Use the _N_ automatic variable in a data step like:
DATA TEMPLIB.my_dataset (label="my dataset with auto increment variables");
SET TEMPREP.my_dataset;
sas_incr_num = _N_; * add an auto increment 'sas_incr_num' variable;
sas_incr_cat = cat("AB.",cats(repeat("0",5-ceil(log10(sas_incr_num+1))),sas_incr_num),".YZ"); * auto increment the sas_incr_num variable and add 5 leading zeros and concatenate strings on either end;
LABEL
sas_incr_num="auto number each row"
sas_incr_cat="auto number each row, leading zeros, and add strings along for fun"
...
There is no such thing as an auto increment column in a SAS dataset. You can use a data step to create a new dataset that has the new variable. You can use the same name to have it replace the old one when done.
data pmt.W_cur_qtr_recoveries;
set pmt.W_cur_qtr_recoveries;
ID+1;
run;
It really depends on what your intended outcome is, but I have thrown together an example of how you may want to tackle this. It is a little rough, but it gives you something to work from.
/*JUST SETTING UP THE DAY ONE DATA WITH AN ID ATTACHED
YOU WOULD MAKE THE FIRST RUN EXECUTE DIFFERENTLY TO SUBSEQUENT RUNS BY USING THE EXISTS FUNCTION AND MACRO LANGUAGE,
BUT I WILL LET YOU INVESTIGATE THIS FURTHER AS IT MAY BE IRRELEVANT.*/
DATA DAY1;
SET SASHELP.CLASS;
ID+1;
RUN;
/*ON DAY 2 WE ARE APPENDING ADDITIONAL RECORDS TO THE EXISTING DATASET*/
DATA DAY2;
/*APPEND DATASETS*/
SET DAY1 SASHELP.CLASS;
/*HOLD VALUE IN PROGRAM DATA VECTOR (PDV) UNTIL EXPLICITLY CHANGED*/
RETAIN _ID;
/*ADD VARIABLE _ID AND POPULATE WITH ID. IN DOING THIS THE LAST INSTANCE OF THE ID WILL BE HELD IN THE PDV FOR THE
FIRST OF THE NEW RECORDS*/
IF ID ~= . THEN _ID = ID;
/*INCREMENT THE VALUE IN _ID BY 1 AND DO SO FOR EACH RECORD ADDED*/
ELSE DO;
_ID+1;
END;
/*DROP THE ORIGINAL ID;*/
DROP ID;
/*RENAME _ID TO ID*/
RENAME _ID = ID;
RUN;
In the code below, "W_prv_qtr_recoveries" is the table name and "pmt" is the library name.
Thanks to user2337871.
DATA pmt.W_prv_qtr_recoveries;
SET pmt.W_prv_qtr_recoveries;
RETAIN _ID;
IF ID ~= . THEN _ID = ID;
ELSE DO;
_ID+1;
END;
DROP ID;
RENAME _ID = ID;
RUN;
Assuming that this auto-increment column should be populated for every record that is inserted, we can accomplish the same as follows.
First, we check the latest key in the dataset:
PROC SQL;
SELECT MAX(KEY) INTO :MK FROM MYDATA;
QUIT;
%put KeyOld=&MK;
Then we increment this key
Data _NULL_;
call symputx('KeyNew', &MK + 1); /* symputx stores the number without leading blanks or a conversion note */
run;
%put KeyNew=&KeyNew;
Here we hold the new record that we want to insert, and add the corresponding key:
Data TEMP1;
set TEMP;
Key=&KeyNew;
run;
Finally we load the new record in our dataset
PROC APPEND BASE=MYDATA DATA=TEMP1 FORCE;
RUN;
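The steps above give every row in TEMP the same key, which fits a single new record. If several records arrive at once, one possible variation is to offset the current maximum by _N_, as sketched below. The work.mydata and work.temp tables and their contents are hypothetical stand-ins.

```sas
/* Hypothetical stand-ins for MYDATA (existing rows) and TEMP (new rows) */
data work.mydata;
  length Key 8 Name $10;
  do Key = 1 to 3;
    Name = cats('old', Key);
    output;
  end;
run;

data work.temp;
  length Name $10;
  do Name = 'new_a', 'new_b';
    output;
  end;
run;

proc sql noprint;
  /* coalesce covers the case where the base table is still empty */
  select coalesce(max(Key), 0) into :MK from work.mydata;
quit;

data work.temp1;
  set work.temp;
  Key = &MK + _n_;   /* _n_ counts rows read, so keys continue from the maximum */
run;

proc append base=work.mydata data=work.temp1;
run;
```

This still assumes that only one process appends at a time; nothing here protects against two sessions reading the same maximum.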

An efficient way to update multiple columns using join

I am really struggling with this, guys.
The table that needs to be updated has ~15M rows and ~200 columns.
I need to update a few columns using a work table.
This is (partly) what I need to do:
%macro condition;
%if &row_count>0 %then %do;
data _null_;
set W4TWGKJ6 end=final;
if _n_ = 1 then call execute("proc sql ;");
call execute
("update dds.insurance_policy set X_STORNO_BY_VERSION="||TOSNUM||" where policy_no='"||cats(polid)||"' and X_INSURANCE_PRODUCT_CD='"||cats(prodid)||"'
and X_INSURER_SERIAL_NO = "||X_INSURER_SERIAL_NO||" and x_source_system_cd ="||'"5"'||" and x_source_system_category_cd ="||'"5"'||" and x_current_ind = "||'"Y"'||";,
update dds.insurance_policy set STATUS_CHANGE_DT="||ISSUE_DT||" where policy_no='"||cats(polid)||"' and X_INSURANCE_PRODUCT_CD='"||cats(prodid)||"'
and X_INSURER_SERIAL_NO = "||X_INSURER_SERIAL_NO||" and x_source_system_cd ="||'"5"'||" and x_source_system_category_cd ="||'"5"'||" and x_current_ind = "||'"Y"'||";");
if final then call execute('quit;'); run;
%end;
%mend;
%condition;
I first check if there are rows in table (&row_count)
if there are,
I update 2 columns (I need to update 5, I just cut them from the example)
using a work table called W4TWGKJ6.
This update takes forever.
In fact, I stopped the process every single time, as it worked for hours without returning anything....
Does anyone know a better solution for this problem?
Thanks in advance,
Gal.
I'd suggest using the MODIFY statement in a DATA step.
Both tables should use the same column names for the BY variables and should be sorted by those variables.
data dds.insurance_policy;
modify
dds.insurance_policy
W4TWGKJ6 (keep= POLICY_NO X_INSURER_SERIAL_NO /* key variables */
X_STORNO_BY_VERSION STATUS_CHANGE_DT /* ... other variables from source to update target */
)
updatemode=nomissingcheck;
by POLICY_NO X_INSURER_SERIAL_NO;
if _iorc_ = %sysrc(_SOK) then do;
* Update row ;
replace;
end;
else _error_ = 0;
run;
See the question "SAS: How not to overwrite a dataset when the 'where' condition in a 'Modify' statement does not hold?" for a complete reference of _IORC_ return values.
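To make the _IORC_ handling more concrete, here is a small sketch on made-up tables (work.master and work.trans are hypothetical). It distinguishes a successful match from a transaction row that has no match in the master, using the _SOK and _DSENMR mnemonics with %sysrc.

```sas
/* Hypothetical master and transaction tables, for illustration only */
data work.master;
  input policy_no $ status $;
  datalines;
A1 old
A2 old
A3 old
;
run;

data work.trans;
  input policy_no $ status $;
  datalines;
A2 new
A9 new
;
run;

data work.master;
  modify work.master work.trans;
  by policy_no;
  select (_iorc_);
    when (%sysrc(_SOK)) replace;           /* match found: write the updated row */
    when (%sysrc(_DSENMR)) _error_ = 0;    /* no matching master row: skip it quietly */
    otherwise do;
      put 'Unexpected _IORC_ value: ' _iorc_=;
      _error_ = 0;
    end;
  end;
run;
```

Resetting _ERROR_ keeps the log free of data dumps for transaction rows that simply have no counterpart in the master table.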

SAS creating and populating dataset variables from macro variables

I have a group of data sets where certain variables have been defined as having lengths >2000 characters. What I want to do is create a macro that identifies these variables and then creates a set of new variables to hold the values.
doing this in base code would be something like:
data new_dset;
set old_dset;
length colnam1 colnam2 colnam3 $2000;
colnam1 = substr(long_column,1,2000);
colnam2 = substr(long_column,2001,2000);
run;
I can build up the list of variable names and lengths as a set of macro variables, But I don't know how to create the new variables from the macro variables.
What I was thinking it would look like is:
%macro split;
data new_dset;
set old_dset;
%do i = 1 %to &num_cols;
if &&collen&i > 2000 then do;
&&colnam&i 1 = substr(&&colnam&i,1,2000);
end;
%end;
run;
%mend;
I know that doesn't work, but that's the idea I have.
If anyone can help me work out how I can do this I would be very grateful.
Thanks
Bryan
Your macro doesn't need to be an entire data step. In this case it's helpful to see exactly what you're replicating and then write a macro based on that.
So your code is:
data new_dset;
set old_dset;
length colnam1 colnam2 colnam3 $2000;
colnam1 = substr(long_column,1,2000);
colnam2 = substr(long_column,2001,2000);
run;
Your macro then really needs to be:
length colnam1 colnam2 colnam3 $2000;
colnam1 = substr(long_column,1,2000);
colnam2 = substr(long_column,2001,2000);
So what you can do is put that in a macro:
%macro split(colname=);
length &colname._1 &colname._2 $2000;
&colname._1 = substr(&colname.,1,2000);
/* no length argument: take the remainder, which the $2000 target truncates if needed */
&colname._2 = substr(&colname.,2001);
%mend;
Then you generate a list of calls:
proc sql;
select cats('%split(colname=',name,')') into :calllist separated by ' '
from dictionary.columns
where libname = 'WORK' and memname = 'OLD_DSET' /* dictionary values are stored in uppercase */
and length > 2000;
quit;
Then you run them:
data new_dset;
set old_dset;
&calllist;
run;
Now you're done :) &calllist contains a list of %split(colname) calls. If you may need more than 2 variables (ie, > 4000 length), you may want to add a new parameter 'length'; or if you're in 9.2 or newer you can just use SUBPAD instead of SUBSTR and generate all three variables for each outer variable.