import csv file in SAS virtual machine - sas

I am new to SAS. I have a CSV file with around 3000 rows and 10 columns that I want to import into SAS, but I am on a Mac and run SAS in a virtual machine. How can I import it?
I tried to copy the file across, but that does not work.

3000 rows isn't big! I can't comment on the specifics of your VM and file access configuration, but one easy way is to simply copy and paste your CSV values into SAS Studio and read them in using the datalines statement, e.g.:
/* set up a temp fileref to hold your csv */
filename tmp temp;
/* read in the raw data using datalines, and write to fileref */
data _null_;
infile datalines ;
file tmp ;
input;
put _infile_;
datalines;
col1,col2,col3,col4
your,data,goes,here
see,how,it,works?
;
run;
/* import the csv any way you like */
proc import datafile=tmp out=work.want dbms=csv replace;
getnames=yes;
run;
A more efficient option would be to build the dataset directly from the datalines. I'll leave it to you to decide which is more convenient, but here's a head start:
data work.want;
infile datalines delimiter=',';
input col1 $ col2 $ col3 $ col4 $;
datalines;
your,data,goes,here
see,how,it,works?
;
run;
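If the VM exposes a shared or upload folder, the file can also be imported in place instead of being pasted. The following is only a sketch: the path /folders/myfolders/mydata.csv and the fileref mycsv are assumptions for illustration, so substitute whatever location your SAS session can actually see.

```sas
/* Assumption: the CSV has been uploaded, or copied into the VM's shared
   folder, so that SAS sees it at this hypothetical path. */
filename mycsv "/folders/myfolders/mydata.csv";

/* Let PROC IMPORT read the header row and scan every row to guess
   column types; 3000 rows is small enough for that. */
proc import datafile=mycsv out=work.want dbms=csv replace;
  getnames=yes;
  guessingrows=max;
run;
```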

Related

How to use filters when importing on sas

I have a very large data table in "dsv" format and I'm trying to import it into SAS. However, I don't have enough space to import the full table and then filter it (which is what I've done for smaller tables).
Is there any way to filter the data while importing it, since in the end I will only use part of that table? For example, I want to import only the rows that have the value 103 for Var2.
PS: I'm using "proc import" rather than "data - infile..." because I don't know the exact number of columns.
Var1  Var2  Var3
A10   103   Test
A02   102   Hiis
...   ...   ....
Thank you
You can add dataset options to the dataset listed in the OUT= option of PROC IMPORT.
Example:
filename dsv temp;
data _null_;
input (var1-var3) (:$20.);
file dsv dsd dlm='|';
put var1-var3;
cards;
Var1 Var2 Var3
A10 103 Test
A02 102 Hiis
;
proc import file=dsv dbms=csv out=want(where=(var2=102)) replace ;
delimiter='|';
run;
The result is a dataset with just one observation.
NOTE: The data set WORK.WANT has 1 observations and 3 variables.
If you don't know the name of the second variable you could always just read the header row first and put the name into a macro variable.
data _null_;
infile dsv dsd dlm='|' truncover obs=1;
input (2*name) (:$32.);
call symputx('var2',nliteral(name));
run;
proc import file=dsv dbms=csv out=want(where=(&var2=102)) replace ;
delimiter='|';
run;
You can add a WHERE= dataset option to the dataset named in the OUT= option. For example:
proc import
file = 'myfile.txt'
out = want(where=(var2=103))
...;
run;

sas macro to read multiple multsheet excel in folder

This is my code to read multiple multi-sheet Excel files in SAS. It gives an error, which I have attached at the end. I am only reading the first sheet, named summary, of every Excel file in that particular folder.
%macro sks2sas01(input=d:\excels,out=work.tt);
/* read files in directory */
%let dir=%str(%'dir %")&input.%str(\%" /A-D/B/ON%');
filename myfiles pipe %unquote(&dir);
data list1; length fname $256.;
infile myfiles truncover;
input myfiles $100.;
/* put _infile_;*/
fname=quote(upcase(cats("&input",'\',myfiles)));
out="&out";
drop myfiles;
call execute('
PROC IMPORT DBMS=EXCEL2002 OUT= _1
DATAFILE= '||fname||' REPLACE ;
sheet="summary";
RUN;
proc append data=_1 base='||out||' force; run;
proc delete data=_1; run;
');
run;
filename myfiles clear;
%mend sks2sas01;
%sks2sas01(input=c:\sasupload\excels,out=work.tt);
Here is the error I am getting:
ERROR: Insufficient authorization to access PIPE.
ERROR: Error in the FILENAME statement.
I had this exact same problem the other day. I had no authorization for the pipe command, and dir using sftp wasn't working for me either. This alternative solution worked great for me. In a nutshell, you're going to use some oldschool SAS directory commands to read every file within that directory, and save only the ones that end in .xlsx.
You can treat each file name as a period-delimited string, scan it backwards, and look at just the first word to pick out only the Excel files. For example:
File name.xlsx
Backwards:
    Delimiter
    v
xlsx.name File
^^^^
First word
Step 1: Read all XLSX files in the directory, and create a dataset of them
filename dir 'C:\MyDirectory';
data XLSX_Files;
DirID = dopen("dir");
do i = 1 to dnum(DirID);
file_name = dread(DirID, i);
if(upcase(scan(file_name, 1, '.', 'b') ) = 'XLSX') then output;
end;
rc=dclose(DirID);
drop i rc DirID;
run;
Step 2: Read all of those names into a pipe-delimited macro variable
proc sql noprint;
select file_name, count(file_name)
into :XLSX_Files separated by '|',
:tot_files trimmed /* TRIMMED removes leading blanks so XLSX_&tot_files resolves cleanly */
from XLSX_Files;
quit;
Step 3: Import them all in a macro loop
%macro import_all;
%do i = 1 %to &tot_files;
proc import file="C:\MyDirectory\%scan(&XLSX_Files,&i,|)"
out=XLSX_&i
dbms=xlsx replace;
run;
%end;
%mend;
%import_all;
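The question asked only for the sheet named summary in each workbook. A hedged variant of the loop above might add a SHEET statement. This is a sketch, not the answerer's code: the macro name import_summary and the output names SUMMARY_1, SUMMARY_2, ... are made up for illustration, and it assumes &XLSX_Files and &tot_files were created as in Step 2.

```sas
/* Sketch only: reuses &XLSX_Files and &tot_files from Step 2.
   The sheet name "summary" comes from the question; the macro and
   output data set names are hypothetical. */
%macro import_summary;
  %local i;
  %do i = 1 %to &tot_files;
    proc import file="C:\MyDirectory\%scan(&XLSX_Files,&i,|)"
                out=SUMMARY_&i
                dbms=xlsx replace;
      sheet="summary";   /* read only this worksheet */
    run;
  %end;
%mend import_summary;
%import_summary;
```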
You can then stack or merge them as you need.
data Master_File;
set XLSX_1-XLSX_&tot_files;
run;
OR
data Master_File;
merge XLSX_1-XLSX_&tot_files;
by key;
run;

SAS Export data to create standard and comma-delimited raw data files

I'm new to SAS and am studying different ways to do the task in the subject line.
Here are the two ways I know at the moment:
Method1: file statement in data step
*DATA _NULL_ / FILE / PUT ;
data _null_;
set engappeal;
file 'C:\Users\1502911\Desktop\exportdata.txt' dlm=',';
put id $ name $ semester scoreEng;
run;
Method2: Proc Export
proc export
data = engappeal
outfile = 'C:\Users\1502911\Desktop\exportdata2.txt'
dbms = dlm;
delimiter = ',';
run;
Question:
1. Is there any alternative way to export raw data files?
2. Is it possible to export the header as well using the data step approach in Method 1?
You can also make use of ODS
ods listing file="C:\Users\1502911\Desktop\exportdata3.txt";
proc print data=engappeal noobs;
run;
ods listing close;
You need to use the DSD option on the FILE statement to make sure that delimiters are properly quoted and missing values are not represented by spaces. Make sure you set your record length long enough, including delimiters and inserted quotes. Don't worry about setting it too long as the lines are variable length.
You can use CALL VNEXT to find and output the names. The LINK statement is so the loop is later in the data step to prevent __NAME__ from being included in the (_ALL_) variable list.
data _null_;
set sashelp.class ;
file 'class.csv' dsd dlm=',' lrecl=1000000 ;
if _n_ eq 1 then link names;
put (_all_) (:);
return;
names:
length __name__ $32;
do while(1);
call vnext(__name__);
if upcase(__name__) eq '__NAME__' then leave;
put __name__ @;
end;
put;
return;
run;
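When the variable list is already known, as in Method 1 of the question, a simpler way to get a header row is to write it as a literal line first and then append the data. This is a minimal sketch under assumptions: the fileref csvout and the file name exportdata_hdr.txt are hypothetical, and the variable names are those from the question.

```sas
/* Hypothetical output file; the folder is the one used in the question. */
filename csvout 'C:\Users\1502911\Desktop\exportdata_hdr.txt';

/* Step 1: create the file containing only the header line. */
data _null_;
  file csvout;
  put 'id,name,semester,scoreEng';
run;

/* Step 2: append the data rows. MOD adds to the existing file,
   DSD quotes values that contain the delimiter. */
data _null_;
  set engappeal;
  file csvout mod dsd dlm=',';
  put id name semester scoreEng;
run;
```

The CALL VNEXT approach above is still the better fit when the variable list is not known in advance.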

Importing in SAS using infile

filename Source 'C:\Source.txt';
Data Example;
Infile Source;
Input Var1 Var2;
Run;
Is there a way I can import all the variables from Source.txt without the "Input Var1 Var2" line? If there are many variables, I think it's too time consuming to list out all the variables, so I was wondering if there's any way to bypass that.
Thanks
Maybe you can use proc import ?
For a CSV I use this and I don't have to define every variable
proc import datafile="&CSVFILE"
out=myCsvData
dbms=dlm
replace;
delimiter=';';
getnames=yes;
run;
It depends on what you have in your txt file. Try different delimiters.
If you are looking at a solution which is INFILE statement based then following reference code should help.
data _null_;
set sashelp.class;
file '/tester/sashelp_class.txt' dsd dlm='09'x;
put name age sex weight height;
run;
/* Version #1 : When data has mixed data(numeric and character) */
data reading_data_w_format;
infile '/tester/sashelp_class.txt' dsd dlm='09'x;
format name $10. age 8. gender $1. weight height 8.2;
input (name--height) (:);
run;
proc print data=reading_data_w_format;run;
proc contents data=reading_data_w_format;run;
/* Version #2 : When all data can be read as character.
I know this version doesn't make much sense, but it's still an option */
data reading_data_wo_format;
infile '/tester/sashelp_class.txt' dsd dlm='09'x;
input (var1-var5) (:$8.); /* Length would be max length of value in all the columns */
run;
proc print data=reading_data_wo_format;run;
proc contents data=reading_data_wo_format;run;
I'd suggest writing out the informats for the variables to be read, so that you can be sure the file matches your specification. PROC IMPORT scans the data from the first row up to the GUESSINGROWS value (do not set it too high if each column is of consistent length) and, based on the lengths and types it finds, picks the informat and length it considers suitable for reading each variable in the file.
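As a rough illustration of the GUESSINGROWS point, here is a sketch for the file from the question. It assumes Source.txt is tab-delimited and has a header row, neither of which the question states, and the value 500 is an arbitrary choice.

```sas
/* Assumptions: C:\Source.txt is tab-delimited and its first row
   holds the variable names. */
proc import datafile='C:\Source.txt' out=Example dbms=dlm replace;
  delimiter='09'x;     /* tab */
  getnames=yes;
  guessingrows=500;    /* rows scanned to decide type and length */
run;

/* Check what PROC IMPORT decided for each variable. */
proc contents data=Example;
run;
```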

Need to repeat data reading from multiple files using SAS and run freqs on the separate datasets created from the separate files

I am new to SAS and am facing a few difficulties while creating the following program.
My requirement is to pass in a dynamically generated filename and read it, so that I don't have to write the code five times to read data from 5 different files and then run freqs on the datasets.
I have provided the code below; I will have to write this code for more than 50 files:
Code
filename inp1 '/chshttp/prod/clients/coms/raw/coms_coms_relg_f1102_t1102_c10216_vEL5535.raw';
filename inp2 '/chshttp/prod/clients/coms/raw/coms_coms_relg_f1103_t1103_c10317_vEL8312.raw';
filename inp3 '/chshttp/prod/clients/coms/raw/coms_coms_relg_f1104_t1104_c10420_vEL11614.raw';
filename inp4 '/chshttp/prod/clients/coms/raw/coms_coms_relg_f1105_t1105_c10510_vEL13913.raw';
filename inp5 '/chshttp/prod/clients/coms/raw/coms_coms_relg_f1106_t1106_c10628_vEL17663.raw';
data test;
Do i = 1 to 5;
infile_name = 'inp' || i;
infile infile_name recfm = v lrecl=1800 end=eof truncover;
INPUT
@1 E_CUSTDEF1_CLIENT_ID $CHAR5.
@1235 E_MED_PLAN_CODE $CHAR20.
@1090 MED_INS_ELIG_COVERAGE_IND $CHAR20.
@1064 MED_COVERAGE_BEGIN_DATE $CHAR8.
@1072 MED_COVERAGE_TERM_DATE $CHAR8.
;
if E_CUSTDEF1_CLIENT_ID ='00002' then
output test;
end;
run;
proc freq data = test;
tables E_CUSTDEF1_CLIENT_ID*E_MED_PLAN_CODE / list missing;
run;
Please help!!
Here's an example you can adapt. There are different ways to do this, but this is one, depending on how you want the frequencies.
Step 1: Create a dataset, 'my_filenames', that stores the filename you want to read in, one per line, in a variable FILE_NAME.
Step 2: Read in the files.
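data my_data;
set my_filenames;
/* FILEVAR= variables are not written to the output dataset, so keep a copy */
source_file = file_name;
infile a filevar=file_name end=eof <the rest of your options>;
/* read every record of the current file, not just the first */
do while (not eof);
<your input statement>;
output;
end;
run;
proc freq data=my_data;
by source_file notsorted;
<your table statements>;
run;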
This is simple, data driven code that doesn't require macros or storing large amounts of data in things that shouldn't have data in them (macro variables, filenames, etc.)
To directly answer your question, here is a SAS macro to read each file and run PROC FREQ:
%macro freqme(dsn);
data test;
infile "&dsn" recfm = v lrecl=1800 end=eof truncover;
INPUT @1 E_CUSTDEF1_CLIENT_ID $CHAR5.
@1235 E_MED_PLAN_CODE $CHAR20.
@1090 MED_INS_ELIG_COVERAGE_IND $CHAR20.
@1064 MED_COVERAGE_BEGIN_DATE $CHAR8.
@1072 MED_COVERAGE_TERM_DATE $CHAR8.
;
if E_CUSTDEF1_CLIENT_ID = '00002';
run;
proc freq data=test;
tables E_CUSTDEF1_CLIENT_ID*E_MED_PLAN_CODE / list missing;
run;
proc delete data=test;
run;
%mend;
%freqme(/chshttp/prod/clients/coms/raw/coms_coms_relg_f1102_t1102_c10216_vEL5535.raw);
%freqme(/chshttp/prod/clients/coms/raw/coms_coms_relg_f1103_t1103_c10317_vEL8312.raw);
%freqme(/chshttp/prod/clients/coms/raw/coms_coms_relg_f1104_t1104_c10420_vEL11614.raw);
%freqme(/chshttp/prod/clients/coms/raw/coms_coms_relg_f1105_t1105_c10510_vEL13913.raw);
%freqme(/chshttp/prod/clients/coms/raw/coms_coms_relg_f1106_t1106_c10628_vEL17663.raw);
Note that I added a PROC DELETE step to delete the SAS data set after creating the report. I did that more for illustration, since you don't say you need the file as a SAS data set for subsequent processing.
You can use this as a template for other macro programming.
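With more than 50 files, typing one %freqme call per file gets tedious. One possible way to drive the macro from data is CALL EXECUTE. This is a sketch only: the dataset my_filenames and its variable file_name are assumptions for illustration (two paths from the question are used as sample rows), and it presumes %freqme has already been compiled as above.

```sas
/* Hypothetical driver table: one raw-file path per row. */
data my_filenames;
  infile datalines truncover;
  input file_name $256.;
datalines;
/chshttp/prod/clients/coms/raw/coms_coms_relg_f1102_t1102_c10216_vEL5535.raw
/chshttp/prod/clients/coms/raw/coms_coms_relg_f1103_t1103_c10317_vEL8312.raw
;
run;

/* Queue one %freqme call per row. %NRSTR delays macro execution
   until this DATA step has finished. */
data _null_;
  set my_filenames;
  call execute(cats('%nrstr(%freqme)(', file_name, ')'));
run;
```

Each queued call runs after the driver step ends, so the reports appear in the same order as the rows of my_filenames.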