Suppose that I have the following list
proc sql;
select name into: list separated by ' ' from dataset;
quit;
and I want to keep the elements of this list in my dataset2. The datastep
data dataset2;
set dataset2;
if name in &list.;
run;
does not work. How should I modify the first proc sql so that the datastep makes sense? Thanks in advance!
This assumes you're looking to filter values within a column, not the number of columns.
If the value is numeric this is what you need:
data dataset3;
set dataset2;
if name in (&list.); /* keep only the rows whose name is in the list */
run;
If the value is character then this is what you need:
proc sql;
select quote(name) into: list separated by ' ' from dataset;
quit;
data dataset3;
set dataset2;
if name in (&list.); /* keep only the rows whose name is in the quoted list */
run;
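As a worked sketch of the character case, the following uses sashelp.class in place of the asker's tables. That substitution, the age filter, and the output name teens are assumptions made only so the example is self-contained.

```sas
/* Build a quoted, space-separated list of names from a lookup query. */
/* sashelp.class stands in for the asker's "dataset" (assumption).    */
proc sql noprint;
  select quote(strip(name)) into :list separated by ' '
  from sashelp.class
  where age = 14;
quit;

/* Show the generated list in the log before using it */
%put list=&list;

/* Subsetting IF: keep only the rows whose name appears in the list */
data teens;
  set sashelp.class;
  if name in (&list.);
run;
```

The parentheses around &list. are what the original data step was missing; the quotes are needed only because name is a character variable.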
I am working with a huge dataset in SAS and trying to use proc sql, and I need help setting up a LIKE condition. I'm trying to extract all the columns that have 'eco' in the name.
I'm getting an error in the where statement as it is not registering the second *.
Any help?
proc sql
select *
from cfy19e8
where * LIKE %eco%;
You could concatenate all of your columns with catx() and find any one that has the word eco.
data have;
input col1$ col2$ col3$;
datalines;
sadfeco kdoa wrfs
asdf asdf sadf
mfecosa mawoeco mfzeco
;
run;
data want;
set have;
where catx('|', col1, col2, col3) LIKE '%eco%';
run;
If you have a lot of character columns, you could use the shortcut _CHARACTER_ to concatenate all variables, then use find() within an if statement in a data step.
data want;
set have;
if(find(catx('|', of _CHARACTER_), 'eco') );
run;
Perhaps
proc contents noprint data=cfy19e8 out=eco_columns(where=(upcase(name) like '%ECO%'));
run;
title 'Columns with ECO in their name';
proc print data=eco_columns;
var name;
run;
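If the goal is to pull out the matching columns rather than just list them, one possible approach is to build the column list from dictionary.columns and feed it back into a query. This is only a sketch: demo_eco is a hypothetical stand-in for cfy19e8, and eco_cols and eco_only are names invented for the example.

```sas
/* Hypothetical stand-in for cfy19e8 */
data demo_eco;
  id = 1; eco_score = 2; price = 3; ECOzone = 4;
run;

proc sql noprint;
  /* Comma-separated list of columns whose name contains ECO in any case. */
  /* LIBNAME and MEMNAME are stored in uppercase in DICTIONARY.COLUMNS.   */
  select nliteral(name) into :eco_cols separated by ', '
  from dictionary.columns
  where libname = 'WORK' and memname = 'DEMO_ECO'
    and upcase(name) like '%ECO%';

  /* Use the list as the SELECT list; assumes at least one column matched */
  create table eco_only as
  select &eco_cols
  from demo_eco;
quit;
```

If no column matches, eco_cols is never created and the second statement fails, so a real program would want to check for that first.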
I'm kinda new to SAS.
I have 2 datasets: set1 and set2.
I'd like to get a list of variables that's in set2 but not in set1.
I know I can easily see them by doing proc compare and then listvar.
However, I'd like to copy and paste the whole list of differing variables instead of copying them one by one from the generated report.
I want either a macro variable containing all the differing variables separated by spaces, or a plain-text printout of the variables that I can easily copy.
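proc contents data=set1 noprint out=cols1; run;
proc contents data=set2 noprint out=cols2; run;
/* Despite its name, COMMON holds the variables that are in set2 but not in set1 */
data common;
merge cols1 (in=a) cols2 (in=b);
by name;
if not a and b;
keep name;
run;
/* Print the names and also store them, comma-separated, in &commoncols */
proc sql;
select name into :commoncols separated by ','
from work.common;
quit;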
Get the list of variable names and then compare the lists.
Conceptually, the simplest way to see what is in a dataset is to use proc contents.
proc contents data=set1 noprint out=content1 ; run;
proc contents data=set2 noprint out=content2 ; run;
Now you just need to find the names that are in one and not the other.
An easy way is with PROC SQL set operations.
proc sql ;
create table in1_not_in2 as
select name from content1
where upcase(name) not in (select upcase(name) from content2)
;
create table in2_not_in1 as
select name from content2
where upcase(name) not in (select upcase(name) from content1)
;
quit;
You could also push the lists into macro variables instead of datasets.
proc sql noprint ;
select name
into :in1_not_in2 separated by ' '
from content1
where upcase(name) not in (select upcase(name) from content2)
;
select name
into :in2_not_in1 separated by ' '
from content2
where upcase(name) not in (select upcase(name) from content1)
;
quit;
Then you could use the macro variables to generate other code.
data both;
set set1(drop=&in1_not_in2) set2(drop=&in2_not_in1) ;
run;
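The same comparison can also be written with an explicit set operator. The sketch below uses EXCEPT on DICTIONARY.COLUMNS instead of the NOT IN subqueries; the two small demo tables and the macro variable name only_in_set2 are assumptions made for illustration.

```sas
/* Hypothetical demo tables */
data set1; a = 1; b = 2; run;
data set2; b = 2; c = 3; d = 4; run;

proc sql noprint;
  /* Names in SET2 minus names in SET1, compared in uppercase */
  select name into :only_in_set2 separated by ' '
  from (
    select upcase(name) as name from dictionary.columns
      where libname = 'WORK' and memname = 'SET2'
    except
    select upcase(name) as name from dictionary.columns
      where libname = 'WORK' and memname = 'SET1'
  );
quit;

/* Write the space-separated list to the log, ready to copy */
%put only_in_set2=&only_in_set2;
```

When the two tables have identical variables, the query returns no rows and only_in_set2 is not created, so the %put would raise a warning about an unresolved reference.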
Issue:
I'm looking to reverse the order of all columns in a sas dataset. Should I achieve this by first transposing and then using a loop to reverse the order of the columns? This is my logic...
Step One:
data pre_transpose;
set sashelp.class;
*set &&dataset&i.. ;
_row_ + 1; * Unique identifier ;
length _charvar_ $20; * Create 1 character variable ;
run;
Step One Output:
Step Two: Do I Reverse Columns Here?
proc transpose data = pre_transpose out = middle (where = (lowcase(_name_) ne '_row_'));
by _row_;
var _all_;
quit;
Step Two Output:
EDIT:
I have attempted this:
/* use proc sql to create a macro variable for column names */
proc sql noprint;
select varnum, nliteral(name)
into :varlist, :varlist separated by ' '
from dictionary.columns
where libname = 'WORK' and memname = 'all_character'
order by varnum desc;
quit;
/* Use retain to maintain format */
data reverse_columns;
retain &varlist.;
set all_character;
run;
But I did not achieve the results I was looking for - the column order is not reversed.
You just need to get the list of variable names. One way is to use the metadata. So if your dataset is member HAVE in libref WORK, you could use this to get the list of variable names into a single macro variable.
proc sql noprint;
select varnum , nliteral(name)
into :varlist, :varlist separated by ' '
from dictionary.columns
where libname='WORK' and memname='HAVE'
order by varnum desc
;
quit;
You could then use the macro variable in a data step like this.
data want ;
retain &varlist ;
set have ;
run;
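Note that the values of libname and memname in DICTIONARY.COLUMNS are always in uppercase, so a condition such as memname = 'all_character' matches no rows; use memname = 'ALL_CHARACTER' instead.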
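To make the uppercase requirement harder to trip over, the member name can be kept in a macro variable and passed through %upcase. The following is a minimal sketch under assumptions: the macro variables ds and ignore_varnum are hypothetical, and sashelp.class is copied into WORK only so the example runs on its own.

```sas
%let ds = all_character;   /* hypothetical member name, typed in lowercase */

data &ds;
  set sashelp.class;
run;

proc sql noprint;
  /* VARNUM is selected only so that ORDER BY can use it; */
  /* its value goes into a throwaway macro variable.      */
  select varnum, nliteral(name)
    into :ignore_varnum, :varlist separated by ' '
  from dictionary.columns
  where libname = 'WORK' and memname = "%upcase(&ds)"
  order by varnum desc;
quit;

/* RETAIN before SET fixes the variable order without changing any values */
data reverse_columns;
  retain &varlist;
  set &ds;
run;

/* List the variables in their stored order to check the result */
proc contents data=reverse_columns varnum;
run;
```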
I have an excel file that needs to be imported periodically to sas. The names of the columns are in row 2 and the number of columns can change. I'm using the following query:
proc import file = "file.xlsx"
out = sasfile
dbms= excel replace;
sheet = "sheet1";
range = "sheet1$A2:BE2000";
getnames = yes;
run;
However, I keep getting F variables in the sas output. How can I dynamically input only the columns that have names?
Are you saying that if the column doesn't have a name in the second row then you want to remove that column from the resulting table?
It is a bit of a pain to get PROC IMPORT to read an XLSX file that is not formatted as a table since it does not support NAMEROW, STARTROW, DATAROW, etc. But you might be able to do it by just reading the names and the data separately.
First let's create some macro variables to make the solution easy to modify.
%let sheetname=SHEET1;
%let startrow=2;
%let lastrow=2000;
%let startcol=A;
%let lastcol=BE;
Now let's read in the variable names from &STARTROW.
proc import datafile='c:\users\abernathyt\downloads\book1.xlsx' replace
dbms=xlsx out=names1;
range="&sheetname.$&startcol.&startrow:&lastcol.&startrow";
getnames=no;
run;
And then transpose it.
proc transpose data=names1 out=names2;
var _all_;
run;
Now let's generate old=new pairs for the columns we want to rename and also the list of columns that we want to drop.
proc sql noprint ;
select case when col1 ne ' ' then catx('=',_name_,nliteral(trim(col1))) else ' ' end
, case when col1 ne ' ' then ' ' else _name_ end
into :rename separated by ' '
, :drop separated by ' '
from names2
;
quit;
Now let's read in the data and add dataset options to rename and/or drop columns on the way out.
proc import datafile='c:\users\abernathyt\downloads\book1.xlsx' replace
dbms=xlsx out=want(rename=(&rename) drop=&drop)
;
range="&sheetname.$&startcol.%eval(&startrow+1):&lastcol.&lastrow";
getnames=no;
run;
I think you are getting those because you are explicitly specifying the sheet and range. I just made a simple file, and the import worked as expected with the SAS code given below.
PROC IMPORT OUT= WORK.imported_file DATAFILE= "file.xlsx"
DBMS=EXCEL REPLACE;
GETNAMES=YES;
RUN;
If you are trying to start from a certain row, you can achieve that using the following statements (they are supported for DBMS=XLS):
namerow=2;
startrow=3;
I don't think there's an easy way to prevent proc import from creating the named F variables. But it's not hard to remove them after the import.
First, create a macro variable containing the F vars. I've chosen to use the dictionary.columns table to find variables that begin with "F" and only contain digits from the 2nd position to the end of name. You don't want to drop variables with names such as "flag", "F12_23" or "f2var".
* imported table in work.xl;
proc sql noprint;
select name into :fvars separated by ', '
from dictionary.columns
where
libname = 'WORK' and
memname = 'XL' and
name like 'F%' and
notdigit(strip(name), 2) = 0
;
quit;
Then use alter table to drop the variables.
proc sql;
alter table xl
drop &fvars;
quit;
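Before dropping anything, it may be worth checking which names the filter actually picks up. The sketch below builds a hypothetical table xl containing both generated-looking names and the look-alikes mentioned above, then prints a 0/1 flag per column using the same condition; the column values and the alias is_fvar are invented for the example.

```sas
/* Hypothetical imported table with F-variables and look-alike names */
data xl;
  F1 = .; F2 = .; F10 = .;
  flag = 1; F12_23 = 2; f2var = 3; amount = 4;
run;

proc sql;
  /* is_fvar is 1 when the name is an uppercase F followed only by digits */
  select name,
         (name like 'F%' and notdigit(strip(name), 2) = 0) as is_fvar
  from dictionary.columns
  where libname = 'WORK' and memname = 'XL';
quit;
```

Only the rows flagged with 1 would end up in &fvars. Note that the LIKE comparison is case-sensitive, which is why a name such as f2var is left alone.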
It's pretty straight-forward.
Suppose you have a data set such as the following:
FSA1 SSA1 SBW1
1 2 3
Is there a way in a data step to filter columns that do not contain 'SA'? I don't want to use a drop or a keep statement as the real dataset has hundreds of variables.
Something like this:
proc sql;
select name into: dropnames
separated by " "
from dictionary.columns
where libname='SASHELP' and memname='CLASS'
and name contains 'He';
quit;
data class;
set sashelp.class;
drop &dropnames;
run;
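Applied to the example in the question, a sketch might look like the following. It assumes the goal is to keep only the columns whose names contain 'SA' (so SBW1 is removed); the table names have and want and the macro variable keepnames are hypothetical.

```sas
/* Recreate the example data from the question */
data have;
  FSA1 = 1; SSA1 = 2; SBW1 = 3;
run;

proc sql noprint;
  /* Space-separated list of columns whose name contains SA;  */
  /* the 'i' modifier makes FIND ignore case.                 */
  select nliteral(name) into :keepnames separated by ' '
  from dictionary.columns
  where libname = 'WORK' and memname = 'HAVE'
    and find(name, 'SA', 'i') > 0;
quit;

/* KEEP= as a dataset option, so unwanted columns are never read */
data want;
  set have(keep=&keepnames);
run;
```

Building a keep list rather than a drop list avoids an empty macro variable when every column matches, but it fails in the opposite case where no column contains 'SA'.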