I noticed in the SAS log that when I run proc export data=mydata outfile="csv.csv" dbms=csv replace; run;, the internally generated data step declares a comma format for one of the variables: comma20.3.
138 format YEAR best12. ;
145 format RATE_SPREAD comma20.3 ;
How can I get proc export not to do this and to export without comma separators, e.g. 9000 instead of 9,000?
Unfortunately PROC EXPORT does not support the FORMAT statement.
You could make a view to the original data with the format removed and export that.
data for_export / view=for_export;
set mydata;
format rate_spread ;
run;
proc export data=for_export outfile="csv.csv" dbms=csv replace;
run;
But you really don't need to use PROC EXPORT to write a CSV file. A data step works just as well. You might have to do a little work to add the header row.
proc transpose data=mydata(obs=0) out=names ;
var _all_;
run;
data _null_;
file "csv.csv" dsd ;
set names;
put _name_ @;
run;
data _null_;
file "csv.csv" dsd mod ;
set mydata;
put (_all_) (+0);
format rate_spread ;
run;
I am trying to export a dataset in my Work library. It looks normal in SAS. However, when I export the data as a CSV or txt file (either via right click -> export, or with SAS code), the last few column names are missing (empty in the CSV), while the values are kept. The missing column names all have the form "Log_xxx", but some columns with the same name pattern were exported correctly. There are around 4000+ columns in my dataset.
The code I've tried is like:
proc export data=logdata
outfile="path.csv"
dbms=csv
replace;
run;
I've exported many datasets before, but it's the first time I have this kind of problem. I've tried to restart SAS and it's still not working.
I simply wanted to export the whole dataset completely with all column names and values.
Do you have any ideas?
I don't think it is PROC EXPORT that is the issue. You have to tell SAS that you want to write lines that are longer than 32,767 bytes (the default setting for the LRECL option).
This code works:
data test;
array longname [3500] ;
run;
filename csv temp lrecl=1000000 ;
proc export data=test dbms=csv file=csv ;
run;
So change your code to set the LRECL long enough for all of the variable names.
filename csv "path.csv" lrecl=1000000 ;
proc export data=logdata
outfile=csv
dbms=csv
replace
;
run;
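To choose an LRECL with some margin, it may help to estimate how long the header line will be. The sketch below sums the variable name lengths (plus one delimiter each) from dictionary.columns. It assumes the data set is WORK.LOGDATA as in the question, and the macro variable name hdrlen is made up for illustration. Names that need quoting, or exporting with the LABEL option, would make the real header longer, so treat the number as a rough lower bound.

```sas
proc sql noprint;
  /* rough header length: each variable name plus one comma */
  select sum(length(name) + 1) into :hdrlen trimmed
    from dictionary.columns
    where libname = "WORK" and memname = "LOGDATA";
quit;
%put NOTE: estimated header length is &hdrlen bytes;
```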
Based on this post, your header is likely exceeding 32k characters, which causes the issues.
The solution is to create the file manually without PROC EXPORT. Alternatively, exporting to XLSX with PROC EXPORT doesn't appear to have the issue.
*Create demo data;
data class;
set sashelp.class;
label age='Age, Years' weight = 'Weight(lbs)' height='Height, inches';
run;
proc sql noprint;
create table temp as
select name as _name_, label as _label_
from dictionary.columns
where libname="WORK" and upcase(memname)="CLASS";
select nliteral(name) into :varList separated by ' '
from dictionary.columns
where libname="WORK" and upcase(memname)="CLASS";
quit;
data _null_;
file "&sasforum.\datasets\TwoLinesHeader.csv" dsd lrecl = 40000;
set class;
if _n_ = 1 then do;
do until(eof);
set temp end=eof;
put _name_ @;
end;
put;
end;
put (&varList) (:);
run;
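For the XLSX route mentioned above, a minimal sketch could look like the following. It assumes the SAS/ACCESS Interface to PC Files is licensed (DBMS=XLSX needs it), and the output name path.xlsx is only a placeholder mirroring the question's path.csv.

```sas
/* export to an Excel workbook instead of CSV */
proc export data=logdata
  outfile="path.xlsx"
  dbms=xlsx
  replace;
run;
```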
Is there a method to make the first delimiter in an observation different to the rest? In Microsoft SQL Server Integration Services (SSIS), there is an option to set the delimiter per column. I wonder if there is a similar way to achieve this in SAS with an amendment to the below code, whereby the first delimiter would be tab instead and the rest pipe:
proc export
dbms=csv
data=mydata.dataset1
outfile="E:\OutPutFile_%sysfunc(putn("&sysdate9"d,yymmdd10.)).txt"
replace
label;
delimiter='|';
run;
For example
From:
var1|var2|var3|var4
to
var1 var2|var3|var4
...Where the large space between var1 and var2 is a tab.
Many thanks in advance.
Sounds like you just want to make a new variable that has the first two variables combined and then write that out using tab delimiter.
data fix ;
length new1 $50 ;
set have ;
new1=catx('09'x,var1,var2);
drop var1 var2 ;
run;
proc export data=fix ... delimiter='|' ...
Note that you can reference a variable in the DLM= option on the FILE statement in a data step.
data _null_;
dlm='09'x ;
file 'outfile.txt' dsd dlm=dlm ;
set have ;
put var1 @ ;
dlm='|' ;
put var2-var4 ;
run;
Or you could use the catx() trick in a data _null_ step. You also might want to use the vvalue() function to ensure formats are applied.
data _null_;
length newvar $200;
file 'outfile.txt' dsd dlm='|' ;
set have ;
newvar = catx('09'x,vvalue(var1),vvalue(var2));
put newvar var3-var4 ;
run;
Update: fixed the order of delimiters to match the question.
Final code based on the marked answer by Tom:
data _null_;
dlm='09'x ;
file "E:\outputfile_%sysfunc(putn("&sysdate9"d,yymmdd10.)).txt" dsd dlm=dlm ;
set work.have;
put var1 @ ;
dlm='|';
put var2 var3 var4;
run;
I have a dataset with formats attached to it, and I don't want to remove the formats from the dataset. When I use proc freq or proc print, I want to see the original values and not the formatted ones.
Proc print data=mylib.data;
run;
is there any format=no option?
proc freq data=mylib.data;
tables gender;
format?????
run;
You can remove a format by specifying a null format in the PROC step:
proc freq data=mylib.data ;
tables gender ;
format _ALL_ ;
run ;
_ALL_ is a list of all variables in the dataset.
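The same statement should work in PROC PRINT, which the question also asks about. A minimal sketch using the question's mylib.data follows. Listing a single variable (here gender) instead of _ALL_ would clear the format for that variable only. In both cases the formats stored in the dataset are left unchanged.

```sas
/* print raw values; the formats stored in the dataset are not altered */
proc print data=mylib.data ;
format _ALL_ ;
run ;

/* clear the format for one variable only */
proc freq data=mylib.data ;
tables gender ;
format gender ;
run ;
```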
I'm trying to export the column names of a SAS data set to an xlsx file, but I need the data to be copied starting in the 2nd row of the Excel file. What I have right now:
PROC EXPORT DATA= mylib.test
outfile = "exceltobemodified.xlsx"
dbms = excel replace;
sheet = "test1";
range = "test1$A2:BE2000";
run;
However, I get an error indicating that the RANGE statement is not supported and is ignored in the Export procedure.
Any suggestions?
Try the data set option FIRSTOBS.
PROC EXPORT DATA= mylib.test (firstobs=2)
outfile = "exceltobemodified.xlsx"
dbms = excel replace;
run;
Edit: If by "starting in the 2nd row" you mean to output the data without the variable names, then you have to use PUTNAMES=NO;
PROC EXPORT DATA= mylib.test
outfile = "exceltobemodified.xlsx"
dbms = excel replace;
PUTNAMES=NO;
run;
Load your table with a blank row as first row. Try writing the table to excel file then. It should work.
Proc sql;
insert into test
values('',.,'');
quit;
Proc sort data=test;
by _all_;
run;
Options missing='';
proc export data=test outfile='/home/libname/new.xlsx'
dbms=excel replace;
putnames=no;
run;
I'm importing CSV data in the following format:
SEDOL,12/08/2009,13/08/2009,14/08/2009,17/08/2009,18/08/2009
B1YVN39,7.8431,7.8431,7.8431,7.8431,7.598
B00G7R3,3.8,3.61,3.81,3.81,3.81
2965237,4.5351,4.5351,4.5351,4.5351,4.5351
2554345,7.355,7.355,7.355,7.355,7.355
I'm using the following command:
PROC IMPORT OUT= want
DATAFILE= have
DBMS=CSV REPLACE;
RUN;
Then transposing the data to long format, as follows:
PROC SORT DATA=want OUT=want; BY SEDOL;RUN;
proc transpose data=want out=transp;
by SEDOL;
run;
proc print; run;
How can I import the dates correctly formatted and change the variable type from default to date?
Importing and transposing are handy procedures, but if you understand your data well, a little data step program can deal with this in one step:
data want(keep=sedol v_date v_value);
infile have dsd dlm=',' truncover;
informat sedol $8. d1-d50 ddmmyy10. v1-v50 8.;
format v_date yymmdd10.;
array d(50) d1-d50;
array v(50) v1-v50;
/* Retain the date values and the count of dates */
retain d1-d50 idx;
/* Read header */
if _n_ = 1 then do;
input sedol d1-d50;
/* loop to find how many date columns there are */
do idx=1 to 50 while(d(idx) ne .);
end;
idx = idx - 1; /* must subtract one here */
delete;
end;
/* Read data lines */
input sedol v1-v50;
do i=1 to idx;
v_date = d(i);
v_value = v(i);
output;
end;
run;
As long as your input file is exactly as you describe (a header record with a leading ID variable of no more than 8 characters, followed by some number of date values representing columns), this will process up to 50 measurements. It should be easy enough to modify if your needs change.
I would suggest in this case importing separately data and headers.
First, we import data:
PROC IMPORT OUT= want
DATAFILE= "C:\have.csv"
DBMS=CSV REPLACE;
getnames=no;
datarow=2;
RUN;
Then we import only the first row with variables' names:
options obs=1;
PROC IMPORT OUT= header
DATAFILE= "C:\have.csv"
DBMS=CSV REPLACE;
getnames=no;
RUN;
options obs=max;
Then we transpose the header row into a column and "mask" the values that are illegal as SAS names: add a letter as the first character (it doesn't matter which one, I chose 'D') and replace all slashes '/' with underscores '_':
proc transpose data=header out=header(drop=_name_);var _all_;run;
data header;
set header;
if anydigit(substr(COL1,1,1)) then COL1=cats("D",COL1);
COL1=translate(COL1,"_","/");
run;
Put these new 'cleaned' column names into a macro variable:
proc sql noprint;
select COL1 into :names separated by ' '
from header;
quit;
And generate DATA-step for renaming using CALL EXECUTE routine:
data _null_;
dsid=open("want","i");
num=attrn(dsid,"nvars");
call execute("data want;");
call execute("set want;");
call execute("rename");
do i=1 to num;
call execute(varname(dsid,i)||"="||scan("&names",i," "));
end;
call execute(";run;");
rc=close(dsid);
run;
Now your original SORT and TRANSPOSE:
PROC SORT DATA=want OUT=want; BY SEDOL;RUN;
proc transpose data=want out=transp;
by SEDOL;
run;
And at last 'unmask' those dates (deleting the leading D and replacing _ with /), and convert them to real dates with INPUT(). The RETAIN statement is added just to place the new variable DATE in second position, right after SEDOL.
data transp;
retain SEDOL date;
set transp;
substr(_name_,1,1)='';
_name_=translate(_name_,"/","_");
date=input(strip(_name_),ddmmyy10.);
drop _name_;
format date ddmmyy10.;
run;