join tables from a library - sas

Hi, I am trying to append the datasets in a library that contain a specific variable. For example, I want to append those datasets in the myfile library that contain the name column.
Below is my sample code:
libname myfile 'c:\data';
proc sql noprint ;
select distinct catx(".",libname,memname) into :DataList separated by " "
from dictionary.columns
where libname = 'MYFILE' and upcase(name) = 'NAME';
quit;

Assuming that the type of the variable is consistent across all datasets, something as simple as a SET statement will work:
Data want;
Set &datalist;
Run;
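Before stacking, it may be worth checking that assumption. A minimal sketch, assuming the libref MYFILE and the variable NAME from the question, lists the type and length of that variable in every dataset that contains it:
proc sql;
select memname, type, length
from dictionary.columns
where libname = 'MYFILE' and upcase(name) = 'NAME'
order by type, memname;
quit;
If the TYPE column shows both char and num, the SET statement will stop with an error that the variable has been defined as both character and numeric. Differing lengths do not stop the step, but values can be truncated to the length taken from the first dataset.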

sas: how to create a variable containing a list of different variables in 2 datasets

I'm kinda new to SAS.
I have 2 datasets: set1 and set2.
I'd like to get a list of variables that are in set2 but not in set1.
I know I can easily see them by running proc compare with listvar.
However, I wish to copy and paste the whole list of differing variables instead of copying them one by one from the generated report.
I want either a macro variable containing a list of all the differing variables separated by spaces, or a plain-text printout of the variables from which I can easily copy everything.
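* List the variables of each dataset (NOPRINT suppresses the report);
proc contents data=set1 noprint out=cols1(keep=name); run;
proc contents data=set2 noprint out=cols2(keep=name); run;
* MERGE with BY requires both inputs sorted by the BY variable;
proc sort data=cols1; by name; run;
proc sort data=cols2; by name; run;
* Keep the names found in set2 but not in set1 (the match is case-sensitive);
data common;
merge cols1 (in=a) cols2 (in=b);
by name;
if not a and b;
keep name;
run;
* Store the names in a space-separated macro variable;
proc sql noprint;
select name into :commoncols separated by ' '
from work.common;
quit;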
Get the list of variable names and then compare the lists.
Conceptually, the simplest way to see what is in a dataset is to use proc contents.
proc contents data=set1 noprint out=content1 ; run;
proc contents data=set2 noprint out=content2 ; run;
Now you just need to find the names that are in one and not the other.
An easy way is with PROC SQL set operations.
proc sql ;
create table in1_not_in2 as
select name from content1
where upcase(name) not in (select upcase(name) from content2)
;
create table in2_not_in1 as
select name from content2
where upcase(name) not in (select upcase(name) from content1)
;
quit;
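The same comparison can also be written with an actual set operator. A minimal sketch using EXCEPT, assuming the content1 and content2 datasets created above. The output table names simply mirror the previous step, and the names come back upcased:
proc sql ;
create table in1_not_in2 as
select upcase(name) as name from content1
except
select upcase(name) as name from content2
;
create table in2_not_in1 as
select upcase(name) as name from content2
except
select upcase(name) as name from content1
;
quit;
EXCEPT returns the distinct rows of the first query that do not appear in the second, so no subquery is needed.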
You could also push the lists into macro variables instead of datasets.
proc sql noprint ;
select name
into :in1_not_in2 separated by ' '
from content1
where upcase(name) not in (select upcase(name) from content2)
;
select name
into :in2_not_in1 separated by ' '
from content2
where upcase(name) not in (select upcase(name) from content1)
;
quit;
Then you could use the macro variables to generate other code.
data both;
set set1(drop=&in1_not_in2) set2(drop=&in2_not_in1) ;
run;
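Since the question also asks for plain text that can be copied, the macro variables can simply be written to the log. A small sketch, assuming the two macro variables were created by the PROC SQL step above:
%put in1_not_in2=&in1_not_in2;
%put in2_not_in1=&in2_not_in1;
Note that PROC SQL leaves an INTO macro variable untouched when the query returns no rows, so if one dataset may have no extra variables it is safer to initialize both first, for example with %let in1_not_in2=; and %let in2_not_in1=; before the step.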

SAS import xls only when columns have a name

I have an Excel file that needs to be imported into SAS periodically. The column names are in row 2 and the number of columns can change. I'm using the following code:
proc import file = "file.xlsx"
out = sasfile
dbms= excel replace;
sheet = "sheet1";
range = "sheet1$A2:BE2000";
getnames = yes;
run;
However, I keep getting F variables in the SAS output. How can I dynamically import only the columns that have names?
Are you saying that if the column doesn't have a name in the second row then you want to remove that column from the resulting table?
It is a bit of a pain to get PROC IMPORT to read an XLSX file that is not formatted as a table since it does not support NAMEROW, STARTROW, DATAROW, etc. But you might be able to do it by just reading the names and the data separately.
First let's create some macro variables to make the solution easy to modify.
%let sheetname=SHEET1;
%let startrow=2;
%let lastrow=2000;
%let startcol=A;
%let lastcol=BE;
Now let's read in the variable names from &STARTROW.
proc import datafile='c:\users\abernathyt\downloads\book1.xlsx' replace
dbms=xlsx out=names1;
range="&sheetname.$&startcol.&startrow:&lastcol.&startrow";
getnames=no;
run;
And then transpose it.
proc transpose data=names1 out=names2;
var _all_;
run;
Now let's generate old=new pairs for the columns we want to rename and also the list of columns that we want to drop.
proc sql noprint ;
select case when col1 ne ' ' then catx('=',_name_,nliteral(trim(col1))) else ' ' end
, case when col1 ne ' ' then ' ' else _name_ end
into :rename separated by ' '
, :drop separated by ' '
from names2
;
quit;
Now let's read in the data and add dataset options to rename and/or drop columns on the way out.
proc import datafile='c:\users\abernathyt\downloads\book1.xlsx' replace
dbms=xlsx out=want(rename=(&rename) drop=&drop)
;
range="&sheetname.$&startcol.%eval(&startrow+1):&lastcol.&lastrow";
getnames=no;
run;
I think you are getting those because you are explicitly specifying the sheet and range. I made a simple file and it imported as expected with the SAS code given below.
PROC IMPORT OUT= WORK.imported_file DATAFILE= "file.xlsx"
DBMS=EXCEL REPLACE;
GETNAMES=YES;
RUN;
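If you are trying to start from a certain row, you can do that with the following statements (as far as I know they are only supported with DBMS=XLS, that is, for .xls files):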
namerow=2;
startrow=3;
I don't think there's an easy way to prevent proc import from creating the named F variables. But it's not hard to remove them after the import.
First, create a macro variable containing the F vars. I've chosen to use the dictionary.columns table to find variables that begin with "F" and only contain digits from the 2nd position to the end of name. You don't want to drop variables with names such as "flag", "F12_23" or "f2var".
* imported table in work.xl;
proc sql noprint;
select name into :fvars separated by ', '
from dictionary.columns
where
libname = 'WORK' and
memname = 'XL' and
name like 'F%' and
notdigit(strip(name), 2) = 0
;
quit;
Then use alter table to drop the variables.
proc sql;
alter table xl
drop &fvars;
quit;
It's pretty straight-forward.
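One caveat is that ALTER TABLE fails if no F variables were found, because the macro variable is then empty or missing. A minimal sketch of a guard, assuming the imported table is WORK.XL as above; the macro name drop_fvars and its lib= and mem= parameters are made up for illustration:
%macro drop_fvars(lib=WORK, mem=XL);
%local fvars;
proc sql noprint;
select name into :fvars separated by ', '
from dictionary.columns
where libname = "&lib" and memname = "&mem"
and name like 'F%' and notdigit(strip(name), 2) = 0;
quit;
/* fvars stays empty when the query returns no rows */
%if %length(%superq(fvars)) %then %do;
proc sql;
alter table &lib..&mem drop &fvars;
quit;
%end;
%mend drop_fvars;
%drop_fvars(lib=WORK, mem=XL)
%superq is used inside %length so that the commas in the list are not read as extra arguments.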

How to filter columns in SAS

Suppose you have a data set such as the following:
FSA1 SSA1 SBW1
1 2 3
Is there a way in a data step to filter columns that do not contain 'SA'? I don't want to use a drop or a keep statement as the real dataset has hundreds of variables.
Something like this:
proc sql;
select name into: dropnames
separated by " "
from dictionary.columns
where libname='SASHELP' and memname='CLASS'
and name contains 'He';
quit;
data class;
set sashelp.class;
drop &dropnames;
run;
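Applied to the question, the same idea can build a keep list instead of a drop list. A sketch under assumptions: the dataset is taken to be WORK.HAVE and the output WORK.WANT, both hypothetical names, and the goal is read as keeping only the variables whose names contain SA:
* Collect the names of the variables that contain SA;
proc sql noprint;
select name into :keepnames separated by ' '
from dictionary.columns
where libname='WORK' and memname='HAVE'
and upcase(name) contains 'SA';
quit;
data want;
set have(keep=&keepnames);
run;
With the sample data this would keep FSA1 and SSA1 and leave out SBW1. To do the opposite, change the condition to not (upcase(name) contains 'SA').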

exporting in SAS multiple datasets

I have a set of datasets which I would like to output to an Excel file. Is there a way to do this quickly, rather than calling proc export for each dataset?
%let MyDS = ('out.Ids', 'out.Vars', 'out.Places');
%let MyDSname = (Ids, Vars, Places);
I would like to create a macro that checks whether each dataset exists and then outputs it to an Excel spreadsheet, with the tab name taken from the corresponding entry in MyDSname ...
Something like... %macro Out(MySpreadsheetName, MyDS, MyDSname);
Thanks very much for your help
Assuming you have SAS/ACCESS Interface to PC Files licensed, the easiest way to do this (export several datasets to one workbook) is with a libname.
libname mywbk excel 'c:\pathtomyexcel\excelfile.xlsx';
data mywbk.nameoftab;
set dataset;
run;
In terms of conditionally creating this, you should look at how you're arriving at the list of names to export. In general, you should have a dataset that contains one row per dataset to export, and two columns: the dataset name and the tab name. You can then merge that with sashelp.vtable or dictionary.tables, which are views containing the list of tables in the current SAS session; memname is the name of the table, libname is the name of the library. Then create a macro call from that:
proc sql;
select cats('%out(',dsname,',',tabname,')') into :calllist separated by ' '
from joinedds;
quit;
libname ... ;
&calllist.;
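The %out macro itself is not shown above. A minimal sketch of what it could look like, assuming the libref mywbk from the libname example is already assigned; the parameter names ds and tab are made up for illustration:
%macro out(ds, tab);
/* EXIST returns 1 when the dataset exists */
%if %sysfunc(exist(&ds)) %then %do;
data mywbk.&tab;
set &ds;
run;
%end;
%else %put NOTE: &ds does not exist and was skipped.;
%mend out;
With this in place, a call such as %out(out.Ids, Ids) writes the dataset to a tab named Ids, and a missing dataset only produces a note in the log. The macro has to be defined before &calllist. is resolved.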

Merge multiple tables in sas using loop or macro

I have generated the tables Forecast_2013 to Forecast_2022 in a loop and then merged all the datasets into one table. Now I want to merge the datasets in a loop irrespective of the years. The next year will be 2023 or 2024, and I don't want to manually add set FORECAST_2023; set FORECAST_2024;. How can I put this into a loop using a macro?
Data P_OT.FORECAST(DROP=td qq AGE1 AGE2 AGE3 AGE4 AGEBANDFCST020 AGEBANDFCST030 AGEBANDFCST035P
HSI1_2012 HSI1_2013 HSI1_2014 HSI1_2015 HSI1_2016 HSI1_2017 HSI1_2018 HSI1_2019 HSI1_2020 HSI1_2021 HSI1_2022);
set FORECAST_2013;set FORECAST_2014;set FORECAST_2015;set FORECAST_2016;
set FORECAST_2017;set FORECAST_2018;set FORECAST_2019;set FORECAST_2020;
set FORECAST_2021;set FORECAST_2022;
run;
An alternative to what Scott posted would be:
*Assign library to folder where FORECAST_ files are located;
libname NAME 'C:\Path to Folder';
*Data step to stack files;
Data P_OT.FORECAST(DROP=td qq AGE1 AGE2 AGE3 AGE4 AGEBANDFCST020
AGEBANDFCST030 AGEBANDFCST035P HSI1_:);
set NAME.FORECAST_:;
run;
This should give the same results as what Scott posted using name prefix lists instead of using SQL to produce lists of datasets to be merged and variables to be dropped.
The code above will stack all datasets in the libname library that start with "FORECAST_". It will also drop all variables in the created dataset that begin with "HSI1_".
You can use proc sql and the sashelp.vcolumn table to find the names of all your tables and create a macro variable containing them. To do this, all your tables need to be in the same library, and no other table in the library can contain FORECAST_ in its name. The part of the sql code that says memname like '%FORECAST_%' searches all the tables in the library called libname and selects the ones that contain FORECAST_. The sql step creates a list of your tables that you can then inject into your data step to stack them.
Again: be careful that there are no other tables with a name like FORECAST_, or it will try to stack tables you do not want. The easiest way to ensure this is to put them in their own library when you create these tables.
If you have all these tables in the work library, then replace libname with work.
I'm on my phone and haven't checked the substr and index part, but if I recall correctly that should work.
proc sql noprint;
/* sashelp.vcolumn has one row per column, so DISTINCT keeps each table once */
select distinct "libname."||memname
into :stack_tables separated by ' '
from sashelp.vcolumn
where libname = upper("libname")
and
memname like '%FORECAST_%'
;
/* Build HSI1_<year> from the four characters after the underscore */
select distinct "HSI1_"||substr(memname,index(memname,"_")+1,4)
into :drop_vars separated by ' '
from sashelp.vcolumn
where libname = upper("libname")
and
memname like '%FORECAST_%'
;
quit;
Data P_OT.FORECAST(DROP=td qq AGE1 AGE2 AGE3 AGE4 AGEBANDFCST020 AGEBANDFCST030 AGEBANDFCST035P
&drop_vars.);
set &stack_tables.;
run;
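Since the question asks for a macro loop, here is a minimal sketch of one. The macro name stack_forecasts and its first= and last= parameters are made up for illustration. It assumes the FORECAST_ tables are in the WORK library and, like the answers above, stacks them in a single SET statement:
%macro stack_forecasts(first=2013, last=2022);
%local yr;
Data P_OT.FORECAST(DROP=td qq AGE1 AGE2 AGE3 AGE4 AGEBANDFCST020
AGEBANDFCST030 AGEBANDFCST035P HSI1_:);
set
/* The loop only generates the dataset names inside the SET statement */
%do yr = &first %to &last;
FORECAST_&yr
%end;
;
run;
%mend stack_forecasts;
%stack_forecasts(first=2013, last=2024)
Only the last= value needs to change when a new year is added. Every dataset in the range has to exist, otherwise the data step stops with an error.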