I have an Excel file that needs to be imported into SAS periodically. The column names are in row 2, and the number of columns can change. I'm using the following code:
proc import file = "file.xlsx"
out = sasfile
dbms= excel replace;
sheet = "sheet1";
range = "sheet1$A2:BE2000";
getnames = yes;
run;
However, I keep getting F variables in the SAS output. How can I dynamically import only the columns that have names?
Are you saying that if the column doesn't have a name in the second row then you want to remove that column from the resulting table?
It is a bit of a pain to get PROC IMPORT to read an XLSX file that is not formatted as a table since it does not support NAMEROW, STARTROW, DATAROW, etc. But you might be able to do it by just reading the names and the data separately.
First let's create some macro variables to make the solution easy to modify.
%let sheetname=SHEET1;
%let startrow=2;
%let lastrow=2000;
%let startcol=A;
%let lastcol=BE;
Now let's read in the variable names from &STARTROW.
proc import datafile='c:\users\abernathyt\downloads\book1.xlsx' replace
dbms=xlsx out=names1;
range="&sheetname.$&startcol.&startrow:&lastcol.&startrow";
getnames=no;
run;
And then transpose it.
proc transpose data=names1 out=names2;
var _all_;
run;
Now let's generate old=new pairs for the columns we want to rename and also the list of columns that we want to drop.
proc sql noprint ;
select case when col1 ne ' ' then catx('=',_name_,nliteral(trim(col1))) else ' ' end
, case when col1 ne ' ' then ' ' else _name_ end
into :rename separated by ' '
, :drop separated by ' '
from names2
;
quit;
Now let's read in the data and add dataset options to rename and/or drop columns on the way out.
proc import datafile='c:\users\abernathyt\downloads\book1.xlsx' replace
dbms=xlsx out=want(rename=(&rename) drop=&drop)
;
range="&sheetname.$&startcol.%eval(&startrow+1):&lastcol.&lastrow";
getnames=no;
run;
I think you are getting those because you are explicitly specifying the sheet and range. I made a simple file and it imported as expected with the SAS code below:
PROC IMPORT OUT= WORK.imported_file DATAFILE= "file.xlsx"
DBMS=EXCEL REPLACE;
GETNAMES=YES;
RUN;
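If you are trying to start from a certain row, you can do that with the following statements (as far as I know, NAMEROW= and STARTROW= are documented for DBMS=XLS, so they may be rejected with DBMS=EXCEL or DBMS=XLSX):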
namerow=2;
startrow=3;
I don't think there's an easy way to prevent proc import from creating the named F variables. But it's not hard to remove them after the import.
First, create a macro variable containing the F vars. I've chosen to use the dictionary.columns table to find variables that begin with "F" and only contain digits from the 2nd position to the end of name. You don't want to drop variables with names such as "flag", "F12_23" or "f2var".
* imported table in work.xl;
proc sql noprint;
select name into :fvars separated by ', '
from dictionary.columns
where
libname = 'WORK' and
memname = 'XL' and
name like 'F%' and
notdigit(strip(name), 2) = 0
;
quit;
Then use alter table to drop the variables.
proc sql;
alter table xl
drop &fvars;
quit;
It's pretty straight-forward.
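One gap in the two steps above: if the import produced no F variables, the SELECT returns no rows, &fvars is never created, and the ALTER TABLE statement fails. A minimal sketch of a guard follows, assuming the imported table is WORK.XL as above. The macro name drop_fvars and its parameters lib and mem are hypothetical, introduced only for illustration.

```sas
/* Sketch only: drop_fvars, lib= and mem= are hypothetical names. */
%macro drop_fvars(lib=WORK, mem=XL);
  %local fvars;

  /* Same filter as above: names that are "F" followed only by digits */
  proc sql noprint;
    select name into :fvars separated by ', '
      from dictionary.columns
      where libname = "%upcase(&lib)" and
            memname = "%upcase(&mem)" and
            name like 'F%' and
            notdigit(strip(name), 2) = 0
    ;
  quit;

  /* SQLOBS holds the number of rows returned by the SELECT above; */
  /* skip the ALTER TABLE when nothing matched.                    */
  %if &sqlobs > 0 %then %do;
    proc sql;
      alter table &lib..&mem
        drop &fvars;
    quit;
  %end;
%mend drop_fvars;

%drop_fvars(lib=WORK, mem=XL)
```

The check uses &sqlobs rather than the length of &fvars because the list contains commas, which are awkward to pass to macro functions.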
I am trying to export a dataset in my WORK library. It looks normal in SAS. However, when I export the data as a CSV or txt file (either via right click -> export, or with SAS code), the last few column names are missing (they show as empty in the CSV), while the values are kept. The missing column names are all in the format "Log_xxx", but some columns with the same name format were exported correctly. There are 4000+ columns in my dataset.
The code I've tried is like:
proc export data=logdata
outfile="path.csv"
dbms=csv
replace;
run;
I've exported many datasets before, but it's the first time I have this kind of problem. I've tried to restart SAS and it's still not working.
I simply wanted to export the whole dataset completely with all column names and values.
Do you have any ideas?
I don't think it is PROC EXPORT that is the issue. You have to tell SAS that you want to write lines that are longer than 32,767 bytes (the default setting for the LRECL option).
This code works:
data test;
array longname [3500] ;
run;
filename csv temp lrecl=1000000 ;
proc export data=test dbms=csv file=csv ;
run;
So change your code to set the LRECL long enough for all of the variable names.
filename csv "path.csv" lrecl=1000000 ;
proc export data=logdata
outfile=csv
dbms=csv
replace
;
run;
Based on this post, your header is likely exceeding 32k characters, which causes the issues.
One solution is to create the file manually in a data step instead of using PROC EXPORT. Alternatively, PROC EXPORT to XLSX doesn't appear to have the issue.
*Create demo data;
data class;
set sashelp.class;
label age='Age, Years' weight = 'Weight(lbs)' height='Height, inches';
run;
proc sql noprint;
create table temp as
select name as _name_, label as _label_
from dictionary.columns
where libname="WORK" and upcase(memname)="CLASS";
select nliteral(name) into :varList separated by ' '
from dictionary.columns
where libname="WORK" and upcase(memname)="CLASS";
quit;
data _null_;
file "&sasforum.\datasets\TwoLinesHeader.csv" dsd lrecl = 40000;
set class;
if _n_ = 1 then do;
do until(eof);
set temp end=eof;
put _name_ @;
end;
put;
end;
put (&varList) (:);
run;
Suppose that I have the following list
proc sql;
select name into: list separated by ' ' from dataset;
quit;
and I want to keep the elements of this list in my dataset2. The datastep
data dataset2;
set dataset2;
if name in &list.;
run;
does not work. How should I modify the first proc sql so that the datastep makes sense? Thanks in advance!
This assumes you're looking to filter values within a column, not the number of columns.
If the value is numeric this is what you need:
data dataset3;
set dataset2;
if name in (&list.);
run;
If the value is character then this is what you need:
proc sql;
select quote(name) into: list separated by ' ' from dataset;
quit;
data dataset3;
set dataset2;
if name in (&list.);
run;
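If the list of values is long, the macro variable can hit the maximum macro variable length (65,534 characters). A minimal alternative sketch, assuming the same dataset, dataset2 and name column as in the question, filters with a subquery and needs no macro variable at all:

```sas
/* Sketch: keep only the rows of dataset2 whose name value appears in dataset. */
/* Works for both numeric and character name columns, as long as the type     */
/* of name is the same in both tables.                                        */
proc sql;
  create table dataset3 as
    select *
      from dataset2
      where name in (select name from dataset)
  ;
quit;
```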
Issue:
I'm looking to reverse the order of all columns in a sas dataset. Should I achieve this by first transposing and then using a loop to reverse the order of the columns? This is my logic...
Step One:
data pre_transpose;
set sashelp.class;
*set &&dataset&i.. ;
_row_ + 1; * Unique identifier ;
length _charvar_ $20; * Create 1 character variable ;
run;
Step One Output:
Step Two: Do I Reverse Columns Here?
proc transpose data = pre_transpose out = middle (where = (lowcase(_name_) ne '_row_'));
by _row_;
var _all_;
quit;
Step Two Output:
EDIT:
I have attempted this:
/* use proc sql to create a macro variable for column names */
proc sql noprint;
select varnum, nliteral(name)
into :varlist, :varlist separated by ' '
from dictionary.columns
where libname = 'WORK' and memname = 'all_character'
order by varnum desc;
quit;
/* Use retain to maintain format */
data reverse_columns;
retain &varlist.;
set all_character;
run;
But I did not achieve the results I was looking for - the column order is not reversed.
You just need to get the list of variable names. One way is to use the metadata. So if your dataset is member HAVE in libref WORK, then you could use this to get the list of variable names into a single macro variable.
proc sql noprint;
select varnum , nliteral(name)
into :varlist, :varlist separated by ' '
from dictionary.columns
where libname='WORK' and memname='HAVE'
order by varnum desc
;
quit;
You could then use the macro variable in a data step like this.
data want ;
retain &varlist ;
set have ;
run;
Note that the values of libname and memname in DICTIONARY.COLUMNS are always uppercase.
I am trying to write a PROC SQL query in SAS to determine the maximum of many columns whose names start with particular letters (say RF*). My existing PROC MEANS step looks like this:
proc means data = input_table nway noprint missing;
var age x y z RF: ST: ;
class a b c;
output out = output_table (drop = _type_ _freq_) max=;
run;
Here RF: refers to all columns starting with RF, and likewise ST: for columns starting with ST. I was wondering if there is something similar in PROC SQL that I can use?
Thanks!
Dynamic SQL is indeed the way to go with this, if you must use SQL. The good news is that you can do it all in one proc sql call using only one macro variable, e.g.:
proc sql noprint;
select catx(' ','max(',name,') as',name) into :MAX_LIST separated by ','
from dictionary.columns
where libname = 'SASHELP'
and memname = 'CLASS'
and type = 'num'
/*eq: is not available in proc sql in my version of SAS, but we can use substr to match partial variable names*/
and upcase(substr(name,1,1)) in ('A','W') /*Match all numeric vars that have names starting with A or W*/
;
create table want as select SEX, &MAX_LIST
from sashelp.class
group by SEX;
quit;
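Applied to the question's own names, the same idea might look like the sketch below. It assumes input_table lives in the WORK library, that at least one numeric column starts with RF or ST (otherwise &max_list is never created), and that column names are plain SAS names.

```sas
/* Sketch: build "max(col) as col" for every numeric RF* and ST* column */
proc sql noprint;
  select catx(' ', cats('max(', name, ')'), 'as', name)
    into :max_list separated by ', '
    from dictionary.columns
    where libname = 'WORK'
      and memname = 'INPUT_TABLE'
      and type = 'num'
      and (upcase(name) like 'RF%' or upcase(name) like 'ST%')
  ;

  /* One row per combination of a, b, c, similar to PROC MEANS with NWAY */
  create table output_table as
    select a, b, c,
           max(age) as age, max(x) as x, max(y) as y, max(z) as z,
           &max_list
      from input_table
      group by a, b, c
  ;
quit;
```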
Hi, I am trying to append the datasets in a library that contain a specific column. For example, I want to append the datasets in the myfile library that contain the name column.
Below is my sample code:
libname myfile 'c:\data';
proc sql noprint ;
select distinct catx(".",libname,memname) into :DataList separated by " "
from dictionary.columns
where libname = upcase('myfile') and upcase(name) = 'NAME';
quit;
Assuming that the type of the variable is consistent across all datasets something as simple as SET will work:
Data want;
Set &datalist;
Run;