SAS set statement using colon and creating a filename variable

So using SAS, I have a number of SAS monthend datasets named as follows:
mydata_201501
mydata_201602
mydata_201603
mydata_201604
mydata_201605
...
mydata_201612
Each holds account information at a particular monthend. I want to stack them all into one dataset using the colon wildcard rather than writing out the full set statement, as follows:
data mynewdata;
set mydata_:;
run;
However, there is no datestamp variable within the datasets, so when I stack them I will lose the monthend information for each account. I want to know which row refers to which monthend for each account. Is there a way to automatically create a variable that names the table each row comes from? For example, the long-winded way would be this:
data mynewdata;
set mydata_201501 (in=a) mydata_201502 (in=b) mydata_201503 (in=c)...;
if a then tablename = 'mydata_201501';
if b then tablename = 'mydata_201502';
if c...
run;
but is there a quicker way using colon along these lines?
data mynewdata;
set mydata_:;
tablename = _tablelabel_;
run;
thanks

I always find clicking on comment links annoying, so hopefully here's the answer in your context. Use the INDSNAME= SET statement option to assign the dataset name to a variable:
data mynewdata;
set mydata_: indsname=_tablelabel_;
tablename = _tablelabel_;
run;
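N.B. you can call _tablelabel_ whatever you want, and you may wish to change it so it doesn't look like a SAS-generated variable name. The INDSNAME= variable is temporary and is not written to the output dataset, which is why it is copied into tablename.
INDSNAME= only became a SAS SET statement option in version 9.2.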
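One detail worth sketching: the INDSNAME= value is the full two-level name (for example WORK.MYDATA_201501), so if only the member name is wanted it has to be split off. The variable name _src below is just an illustrative choice, not anything from the question.

```sas
data mynewdata;
  length tablename $32;            /* declared first so it is not truncated */
  set mydata_: indsname=_src;      /* _src is temporary and is not kept     */
  tablename = scan(_src, 2, '.');  /* drop the libref, keep the member name */
run;
```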

Just to be clear, with my particular code, where the datasets were named mydata_yyyymm and I wanted a monthend variable with a datestamp, I was able to produce this using the solution provided by mjsqu as follows (obs= and keep= options included in case they are required):
data mynewdata;
set mydata_: (obs=100 keep=xxx xxx) indsname=_tablelabel_;
format monthend yymmdd10.;
monthend = input(scan(_tablelabel_,-1,'_'),yymmn6.);
run;
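Note that the yymmn6. informat returns the first day of the month. If the last day of the month is what is actually wanted, a small variation using INTNX with the 'e' (end) alignment should do it. This is a sketch on the same assumed dataset names, not part of the original solution.

```sas
data mynewdata;
  set mydata_: indsname=_tablelabel_;
  format monthend yymmdd10.;
  /* read yyyymm from the dataset name, then move to the last day of that month */
  monthend = intnx('month', input(scan(_tablelabel_, -1, '_'), yymmn6.), 0, 'e');
run;
```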

Related

How do I convert a date column using SAS Macro?

I am a SAS Developer. I currently have a script that will read from a table column that is in datetime format. In this script, it looks something like this:
data a; batch_dttm = '01Jan2011:00:00:00'dt; run;
proc sql; select batch_dttm format=16.0 into:batch_dttm from a; quit;
So when I assign it to the macro variable, it is actually assigning the value 2930485 to batch_dttm.
The problem is that when I want to resolve &batch_dttm in another job at a later stage, I have to use this:
input(&batch_dttm,16.0)
to convert 2930485 into date.
I don't want to resolve it this way, as this is the only macro variable that has to be resolved with this input function. I want to assign 01Jan2011:00:00:00 (as text?) in the PROC SQL INTO clause so that I don't have to use the input conversion anymore.
I want to reference &batch_dttm as a datetime in another script. I only want to resolve the datetime using "&BATCH_DTTM"dt instead of input(&batch_dttm,16.0). I believe there is a step to convert 01Jan2011:00:00:00 into text without changing it to 2930485. Is there any way to do so?
How can I add one more step so that I can resolve the macro variable as in the script below:
"&batch_dttm"dt
Why do you think you need to add the INPUT() function? You can just use the value of the macro variable to put the number of seconds into your code.
SAS stores datetime values as a number of seconds since 01JAN1960:00:00:00, so these two expressions are the same (note that '01JAN2011:00:00:00'dt is actually 1609459200 seconds, not the 2930485 quoted in the question):
where batch_dttm = 1609459200 ;
where batch_dttm = '01JAN2011:00:00:00'dt ;
Which means you can just use code like this to use your original macro variable.
where batch_dttm = &batch_dttm ;
If you do need to have the human friendly text in the macro variable, perhaps to use in a title statement, then just change the format you use when creating the macro variable.
select batch_dttm format=datetime20. into :batch_dttm trimmed ...
...
title "Data as of &batch_dttm";
where batch_dttm = "&batch_dttm"dt ;
You can also use %sysfunc() to call PUTN() to change the existing number of seconds into that style if you want.
select batch_dttm format=32. into :batch_dttm trimmed ...
...
title "Data as of %sysfunc(putn(&batch_dttm,datetime20.))";
where batch_dttm = &batch_dttm ;
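If you want to convince yourself that the literal and the raw number are the same value, a quick check along these lines should work. The variable name x is only illustrative; the log should show 1609459200 alongside the formatted datetime.

```sas
data _null_;
  x = '01JAN2011:00:00:00'dt;
  put x= ;              /* raw value: seconds since 01JAN1960:00:00:00 */
  put x= datetime20. ;  /* same value shown with a datetime format     */
run;
```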
Like this?
proc sql;
select put(batch_dttm, datetime20.) into:batch_dttm from a;
quit;
%put &batch_dttm;

Assigning index to two concatenated tables in SAS?

I have two tables with exactly the same column headers and one row each. I have the code to concatenate them, which works fine.
data concatenation;
set CURR_CURR CURR_30;
run;
However, there is no index in the output to say which row corresponds to which table.
I've tried using 'create index' and 'index create' already, but they don't work syntactically. I simply want to add a column of strings and have it appear before all the other columns in the data set.
Use the INDSNAME= option on the SET statement, plus a variable to store the information.
If you put the LENGTH statement ahead of your SET statement, the new variable will be created as the first column.
Just a note that this isn't the same as an 'index'. An index in SAS has a different meaning, which isn't what you're trying to create here.
data concatenation;
length dset source $50.;
set CURR_CURR CURR_30 indsname=source;
dset=source;
run;
Reeza's answer is very similar to something I figured out that worked as well. Here's my version as an alternative.
data concatenation;
length id $ 10;
set CURR_CURR (in=a) CURR_30 (in=b);
if a then id = 'curr_curr';
else if b then id = 'curr_30';
run;

Use a macro variable when saving to a permanent SAS data set

Trying to save data set Keepmerge as a permanent SAS data set called Oct15Tot using the code below. If I substitute just Oct15Tot for "&OutTabTot", it works. I'm trying to save myself from having to change another bit of code further down (the %let is defined at the beginning and is used throughout my program). Thanks!
%let OutTabTot = Oct15Tot;
libname WorkItem "\\WRKGRP\CVOWB\SAS Data Sets";
data WorkItem."&OutTabTot";
set work.Keepmerge;
run;
Here's the error I'm getting:
22
201
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, /, ;,
_DATA_, _LAST_, _NULL_.
ERROR 201-322: The option is not recognized and will be ignored.
If you remove the quotes in your Data statement it should work, like so:
%let OutTabTot = Oct15Tot;
libname WorkItem "\\WRKGRP\CVOWB\SAS Data Sets";
Data WorkItem.&OutTabTot;
Set Work.Keepmerge;
Run;
In general, as cherry notes, you should just skip the quotations.
However, if you have reason to use quotations, you need to use an n afterwards to tell SAS to make this a name literal.
%let OutTabTot = Oct15 Tot;
options validmemname=extend;
libname WorkItem "\\WRKGRP\CVOWB\SAS Data Sets";
Data WorkItem."&OutTabTot"n;
Set Work.Keepmerge;
Run;
I don't recommend using things like dataset names with spaces if you can avoid it, as it's a pain... but it's legal, with options validmemname=extend set.

How can I write a DATA step that will drop all variables from the input dataset except the ones that I explicitly define within the dataset?

I want to generate a new SAS dataset, bar, using table foo as the input, with a one-to-one correspondence between input and output records. I want to drop variables from foo by default, but I also require that all of the fields of foo be available (to derive new variables) and that some variables from foo be kept (if explicitly indicated).
I'm currently managing an explicit list of variables to drop= but it results in long and unwieldy syntax in the data-set-option declaration.*
DATA bar (drop=id data_value2);
set foo;
new_id = id;
data_value1 = data_value1; /* Explicitly included for clarity */
new_derived_data_value = data_value2 * 2; /* etc. */
format new_id $fmt_id.
data_value1 $fmt_dat.
new_derived_data_value $fmt_ddat.
;
RUN;
The output table I want should have only the fields new_id, data_value1 and new_derived_data_value.
I'm looking for the most syntactically succinct way of reproducing the same effect as :
SELECT
id AS new_id
,data_value1
,data_value2 * 2 AS new_derived_data_value
FROM foo
How can I write a DATA step that will drop all variables from the input dataset except the ones that I explicitly define within the dataset?
* Update: I could use aaa--hhh type notation, but even this can be unwieldy if the ordering of the variables changes over time or I later decide I'd like to keep variable ddd.
I would store the variable names in a macro list, obtained from the DICTIONARY tables. You can then drop them all easily in a data step. e.g.
proc sql noprint;
select name into :vars separated by ' '
from dictionary.columns
where libname = 'SASHELP' and memname='CLASS';
quit;
data want (drop=&vars.);
set sashelp.class;
name1=name;
age1=age;
run;
Keith's solution is the best production solution, but a quick alternative assuming you know the first and last variables in the dataset:
data want;
set sashelp.class;
drop name--weight;
name1=name;
age1=age;
run;
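Another option, closer to the SQL in the question, is to list what to keep rather than what to drop. This is a minimal sketch using the variable names from the question; a keep= option on the output dataset only affects what is written out, so every variable from foo is still available inside the step.

```sas
data bar (keep=new_id data_value1 new_derived_data_value);
  set foo;                                   /* all of foo's variables are readable here */
  new_id = id;
  new_derived_data_value = data_value2 * 2;
run;
```

The trade-off is that any new variable added later must also be added to the keep= list.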

SAS - code to dynamically count columns and sum each of them

I’m wondering if someone can help with a coding problem I have.
Background – I have a project that imports some files and uses the data in those files to perform projections. The contents of the files determines some aspects of the size of the output that follows. Simply, values in data loaded in drives the size and shape of the tables that follow, and this can vary.
The following code is an example of the problem.
The data loaded will have a variable year start (note wf2009, 2009 is the first year) and variable range (this example goes from 2009 to 2030, but this will vary too).
proc summary data= labeled_proj_data_hc;
class jurisdiction specialty measure;
types jurisdiction*specialty*measure;
VAR wf2009--wf2030;
output out= sum_labeled_proj_data_hc
sum(wf2009) = y2009
sum(wf2010) = y2010
sum(wf2011) = y2011
sum(wf2012) = y2012;
run;
Where I’m not sure how to proceed is:
sum(wf2009) = y2009
sum(wf2010) = y2010
sum(wf2011) = y2011
sum(wf2012) = y2012;
In the sequence of lines calling for the sum of their respective columns, how can I make this dynamic so that the start year is populated from a variable and it increments yearly until the last year which is also variable.
Has anyone solved a similar problem?
Cheers,
Is renaming the variables necessary? If not then you can use the : wildcard operator to access all variables that begin with 'wf', then just put SUM= in the output statement, which will preserve the original names.
So your proc summary would look like this.
proc summary data= labeled_proj_data_hc;
class jurisdiction specialty measure;
types jurisdiction*specialty*measure;
VAR wf: ;
output out= sum_labeled_proj_data_hc
sum=;
run;
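If the renaming to y2009, y2010 and so on is required, one possible approach is a small macro that generates the sum(...)= pairs in a loop. This is only a sketch: the macro name sum_years and its parameters first and last are hypothetical, and it assumes a wf variable exists for every year in the range.

```sas
%macro sum_years(first, last);
  %local yr;
  proc summary data=labeled_proj_data_hc;
    class jurisdiction specialty measure;
    types jurisdiction*specialty*measure;
    var wf&first-wf&last;                 /* numbered range, e.g. wf2009-wf2030 */
    output out=sum_labeled_proj_data_hc
      %do yr = &first %to &last;
        sum(wf&yr) = y&yr                 /* one sum(...)= pair per year */
      %end;
    ;                                     /* closes the OUTPUT statement */
  run;
%mend sum_years;

%sum_years(2009, 2030)
```

The start and end years could equally come from macro variables populated earlier in the program, for example %sum_years(&startyr, &endyr), where those two names are again just placeholders.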