Variable with tables's names SAS - sas

I want to create custom variable that will store table's name from which observation is comming from.
Something like this:
data FTTH_SOHO_2;
ATTRIB scoring_month;
set fmscore.SCORE_FTTH_CHURN_SOHO_202009 - fmscore.SCORE_FTTH_CHURN_SOHO_202012
fmscore.SCORE_FTTH_CHURN_SOHO_202101 - fmscore.SCORE_FTTH_CHURN_SOHO_202106 = tablename;
scoring_month = tablename;
where tp_desig_num = 'XXXXXXXXXXX';
run;
Ofcourse I get syntax error but is it possible to store the name of currety used Data Set to some kind variable and use it to mark with it observation from which it came from?
I need to see months from which I recive observations.

You're looking for the INDSNAME option.
data want;
set have1 have2 have3 indsname=dsn;
ds_name = dsn;
run;
You do have to create a variable with an assignment statement separate from the indsname option, as the option creates only a temporary variable.

Related

Create a column in SAS using a macro variable

I am trying to create a column using the string value of a macro variable in SAS.
I have a dataset called want7 which has a column called 'ID'. I want to create a new dataset called want8 with a new column called 'ID1' by dynamically linking it to &string1 (as in the name of the column is linked to &string1) but the values of the column should equal the value of the 'ID' column in want7. How do I do this? Thanks in advance. I have only copied and pasted what I could write since I am relatively new to SAS.
%let string1 = ID1;
data want8; set want7;
/*Something like &string1 = ID*
run;
Using sashelp.class as an example (because it exists by default). Substitute as needed:
%let string1 = ID1;
data want8;
set sashelp.class;
&string1 = age ;
run;
This will reread the dataset. If you just want renames, look at the dataset option rename=. See SAS documentation: https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000695119.htm

Converting sas numeric variable type in dataset to character type

I'm importing a csv to a sas dataset with this code:
PROC IMPORT
DATAFILE = '/folders/myshortcuts/SASsoftware_rialto_2015/providence_med_claims_15.csv'
OUT = medical
DBMS=DLM REPLACE;
DELIMITER='|';
getnames=yes;
run;
For the subsequent code it wants one of the fields called DIAGNOSIS_VERSION_CODE in this dataset to be a character type rather than numeric type which is the default. How can I change that default in the above code or convert the field in the dataset?
I tried this and it didn't work:
contents data=medical;
modify medical;
format DIAGNOSIS_VERSION_CODE $CHAR8.;
contents data=medical;
run;
You cannot use PROC DATASETS to change a variable's definition or values. You will need to create a new dataset. You can use the RENAME statement to make the new variable have the name of the old one.
data new_medical;
set medical ;
new_diagnosis_version_code = put(diagnosis_version_code,Z8.);
rename diagnosis_version_code=old_diagnosis_version_code
new_diagnosis_version_code=diagnosis_version_code
;
run;
To prevent this in the future you should write your own data step to read the data instead of asking PROC IMPORT to guess what data you have. Then you have control over how the variables are created.

I want to add auto_increment column in a table in SAS

I want to add a auto_Increment column in a table in SAS.Following code add's a column but not increment the value.
Thanks In Advance.
proc sql;
alter table pmt.W_cur_qtr_recoveries
add ID integer;
quit;
Wow, going to try for my second "SAS doesn't do that" answer this morning. Risky stuff.
A SAS dataset cannot define an auto-increment column. Whether you are creating a new dataset or inserting records into an existing dataset, you are responsible for creating any increment counters (ie they are just normal numeric vars where you have set the values to what you want).
That said, there are DATA step statements such as the sum statement (e.g. MyCounter+1) that make it easier to implement counters. If you describe more details of your problem, people could provide some alternatives.
The correct answer at this time is to create the ID yourself, BUT the discussion wouldn't be complete without mentioning that there is an unsupported SQL function Monotonic that can do what you want. It's not reliable, yet it persists.
The code pattern for its usage is
select monotonic() as ID, ....
Use the _N_ automatic variable in a data step like:
DATA TEMPLIB.my_dataset (label="my dataset with auto increment variables");
SET TEMPREP.my_dataset;
sas_incr_num = _N_; * add an auto increment 'sas_incr_num' variable;
sas_incr_cat = cat("AB.",cats(repeat("0",5-ceil(log10(sas_incr_num+1))),sas_incr_num),".YZ"); * auto increment the sas_incr_num variable and add 5 leading zeros and concatenate strings on either end;
LABEL
sas_incr_num="auto number each row"
sas_incr_cat="auto number each row, leading zeros, and add strings along for fun"
...
There is no such thing as an auto increment column in a SAS dataset. You can use a data step to create a new dataset that has the new variable. You can use the same name to have it replace the old one when done.
data pmt.W_cur_qtr_recoveries;
set pmt.W_cur_qtr_recoveries;
ID+1;
run;
It really depends on what your intended outcome is. But I have thrown together an example of how you may want to tackle this. it is a little rough, but gives you something to work from.
/*JUST SETTING UP THE DAY ONE DATA WITH AN ID ATTACHED
YOU WOULD MAKE THE FIRST RUN EXECUTE DIFFERENTLY TO SUBSEQUENT RUNS BY USING THE EXISTS FUNCTION AND MACRO LANGUAGE,
BUT I WILL LET YOU INVESTIGATE THIS FURTHER AS IT MAY BE IRRELEVANT.*/
DATA DAY1;
SET SASHELP.CLASS;
ID+1;
RUN;
/*ON DAY 2 WE ARE APPENDING ADDITIONAL RECORDS TO THE EXISTING DATASET*/
DATA DAY2;
/*APPEND DATASETS*/
SET DAY1 SASHELP.CLASS;
/*HOLD VALUE IN PROGRAM DATA VECTOR (PDV) UNTIL EXPLICITLY CHANGED*/
RETAIN _ID;
/*ADD VARIABLE _ID AND POPULATE WITH ID. IN DOING THIS THE LAST INSTANCE OF THE ID WILL BE HELD IN THE PDV FOR THE
FIRST OF THE NEW RECORDS*/
IF ID ~= . THEN _ID = ID;
/*INCREMENT THE VALUE IN _ID BY 1 AND DO SO FOR EACH RECORD ADDED*/
ELSE DO;
_ID+1;
END;
/*DROP THE ORIGINAL ID;*/
DROP ID;
/*RENAME _ID TO ID*/
RENAME _ID = ID;
RUN;
where "W_prv_qtr_recoveries" is a table Name and "pmt" is a library name.
Thanks to user2337871.
DATA pmt.W_prv_qtr_recoveries;
SET pmt.W_prv_qtr_recoveries;
RETAIN _ID;
IF ID ~= . THEN _ID = ID;
ELSE DO;
_ID+1;
END;
DROP ID;
RENAME _ID = ID;
RUN;
Assuming that this autoincrement column will be used for every record that is inserted.
We can accomplish the same as follows:-
We will first check the latest key in the dataset
PROC SQL;
SELECT MAX(KEY) INTO :MK FROM MYDATA;
QUIT;
%put KeyOld=&MK;
Then we increment this key
Data _NULL_;
call symput('KeyNew',&MK+1);
run;
%put KeyNew=&KeyNew;
Here we hold the New record that we want to insert, and add the correspoding key
Data TEMP1;
set TEMP;
Key=&KeyNew;
run;
Finally we load the new record in our dataset
PROC APPEND BASE=MYDATA DATA=TEMP1 FORCE;
RUN;

Drop a range of variables in SAS

I currently have a dataset with 200 variables. From those variables, I created 100 new variables. Now I would like to drop the original 200 variables. How can I do that?
Slightly better would be, how I can drop variables 3-200 in the new dataset.
sorry if I was vague in my question but basically I figured out I need to use --.
If my first variable is called first and my last variable is called last, I can drop all the variables inbetween with (drop= first--last);
Thanks for all the responses.
As with most SAS tasks, there are several alternatives. The easiest and safest way to drop variables from a SAS data set is with PROC SQL. Just list the variables by name, separated by a comma:
proc sql;
alter table MYSASDATA
drop name, age, address;
quit;
Altering the table with PROC SQL removes the variables from the data set in place.
Another technique is to recreate the data set using a DROP option:
data have;
set have(drop=name age address);
run;
And yet another way is using a DROP statement:
data have;
set have;
drop name age address;
run;
Lots of options - some 'safer', some less safe but easier to code. Let's imagine you have a dataset with variables ID, PLNT, and x1-x200 to start with.
data have;
id=0;
plnt=0;
array x[200];
do _t = 1 to dim(x);
x[_t]=0;
end;
run;
data want;
set have;
*... create new 100 variables ... ;
*option 1:
drop x1-x200; *this works when x1-x200 are numerically consecutive;
*option 2:
drop x1--x200; *this works when they are physically in order on the dataset -
only the first and last matter;
run;
*Or, do it this way. This would also work with SQL ALTER TABLE. This is
the safest way to do it.;
proc sql;
select name into :droplist separated by ' ' from dictionary.columns
where libname='WORK' and memname='HAVE' and name not in ('ID','PRNT');
quit;
proc datasets lib=work;
modify want;
drop &droplist.;
quit;
If all of the variables you want to drop are named so they all start the same (like old_var_1, old_var_2, ..., old_var_n), you could do this (note the colon in drop option):
data have;
set have(drop= old_var:);
run;
data want;
set have;
drop VAR1--VARx;
run;
Would love to know if you can do this by position.
Definitely works with variable names separated by double dash (--).
I have some macros that would allow this here
You could run that whole set of macros, or just run list_vars(), is_blank(), num_words, find_word, remove_word, remove_words , nth_word().
Using these it would be:
%let keep_vars = keep_this and_this also_this;
%let drop_vars = %list_vars(old_dataset);
%let drop_vars = %remove_words(&drop_vars , &keep_vars);
data new_dataset (drop = &drop_vars );
set old_dataset;
/*stuff happens*/
run;
This will keep the three variables keep_this and_this also_this but drop everything else in the old dataset.

SAS - Creating variables from macro variables

I have a SAS dataset which has 20 character variables, all of which are names (e.g. Adam, Bob, Cathy etc..)
I would like a dynamic code to create variables called Adam_ref, Bob_ref etc.. which will work even if there a different dataset with different names (i.e. don't want to manually define each variable).
So far my approach has been to use proc contents to get all variable names and then use a macro to create macro variables Adam_ref, Bob_ref etc..
How do I create actual variables within the dataset from here? Do I need a different approach?
proc contents data=work.names
out=contents noprint;
run;
proc sort data = contents; by varnum; run;
data contents1;
set contents;
Name_Ref = compress(Name||"_Ref");
call symput (NAME, NAME_Ref);
%put _user_;
run;
If you want to create an empty dataset that has variables named like some values you have in a macro variables you could do something like this.
Save the values into macro variables that are named by some pattern, like v1, v2 ...
proc sql;
select compress(Name||"_Ref") into :v1-:v20 from contents;
quit;
If you don't know how many values there are, you have to count them first, I assumed there are only 20 of them.
Then, if all your variables are character variables of length 100, you create a dataset like this:
%macro create_dataset;
data want;
length %do i=1 %to 20; &&v&i $100 %end;
;
stop;
run;
%mend;
%create_dataset; run;
This is how you can do it if you have the values in macro variable, there is probably a better way to do it in general.
If you don't want to create an empty dataset but only change the variable names, you can do it like this:
proc sql;
select name into :v1-:v20 from contents;
quit;
%macro rename_dataset;
data new_names;
set have(rename=(%do i=1 %to 20; &&v&i = &&v&i.._ref %end;));
run;
%mend;
%rename_dataset; run;
You can use PROC TRANSPOSE with an ID statement.
This step creates an example dataset:
data names;
harry="sally";
dick="gordon";
joe="schmoe";
run;
This step is essentially a copy of your step above that produces a dataset of column names. I will reuse the dataset namerefs throughout.
proc contents data=names out=namerefs noprint;
run;
This step adds the "_Refs" to the names defined before and drops everything else. The variable "name" comes from the column attributes of the dataset output by PROC CONTENTS.
data namerefs;
set namerefs (keep=name);
name=compress(name||"_Ref");
run;
This step produces an empty dataset with the desired columns. The variable "name" is again obtained by looking at column attributes. You might get a harmless warning in the GUI if you try to view the dataset, but you can otherwise use it as you wish and you can confirm that it has the desired output.
proc transpose out=namerefs(drop=_name_) data=namerefs;
id name;
run;
Here is another approach which requires less coding. It does not require running proc contents, does not require knowing the number of variables, nor creating a macro function. It also can be extended to do some additional things.
Step 1 is to use built-in dictionary views to get the desired variable names. The appropriate view for this is dictionary.columns, which has alias of sashelp.vcolumn. The dictionary libref can be used only in proc sql, while th sashelp alias can be used anywhere. I tend to use sashelp alias since I work in windows with DMS and can always interactively view the sashelp library.
proc sql;
select compress(Name||"_Ref") into :name_list
separated by ' '
from sashelp.vcolumn
where libname = 'WORK'
and memname = 'NAMES';
quit;
This produces a space delimited macro vaiable with the desired names.
Step 2 To build the empty data set then this code will work:
Data New ;
length &name_list ;
run ;
You can avoid assuming lengths or create populated dataset with new variable names by using a slightly more complicated select statement.
For example
select compress(Name)||"_Ref $")||compress(put(length,best.))
into :name_list
separated by ' '
will generate a macro variable which retains the previous length for each variable. This will work with no changes to step 2 above.
To create populated data set for use with rename dataset option, replace the select statement as follows:
select compress(Name)||"= "||compress(_Ref")
into :name_list
separated by ' '
Then replace the Step 2 code with the following:
Data New ;
set names (rename = ( &name_list)) ;
run ;