Problem Statement: I have a text file that I want to read using the SAS INFILE statement, but SAS is not giving me the output I expect.
Text File:
Akash    18 19 20
Tejas       20 16
Shashank 16 20
Meera    18    20
The Code that I have tried:
DATA Arr;
INFILE "/folders/myfolders/personal/SAS_Array .txt" missover;
INPUT Name$ SAS DS R;
RUN;
PROC PRINT DATA=arr;
RUN;
While the result I got is:
Obs  Name      SAS  DS  R
1    Akash     18   19  20
2    Tejas     20   16  .
3    Shashank  16   20  .
4    Meera     18   20  .
This is wrong. What is the problem with the code? I need SAS to read the file so that each mark stays in the same column as in the text file. Please help.
Expected result:
Obs  Name      SAS  DS  R
1    Akash     18   19  20
2    Tejas     .    20  16
3    Shashank  16   20  .
4    Meera     18   .   20
Thanks in advance.
If that text file is tab-delimited, you should specify the delimiter in the infile statement and use the dsd option, so that two consecutive delimiters are read as a missing value:
DATA Arr;
INFILE "/folders/myfolders/personal/SAS_Array .txt" missover dlm='09'x dsd;
INPUT Name $ SAS DS R;
RUN;
PROC PRINT DATA=arr;
RUN;
EDIT: after editing, your sample text file now looks fixed-width rather than space-delimited. In that case you should be using column input:
DATA Arr;
INFILE "/folders/myfolders/personal/SAS_Array .txt" missover;
INPUT Name $1-9 SAS 10-12 DS 13-15 R 16-18;
RUN;
Example with datalines:
DATA Arr;
INFILE datalines missover;
INPUT Name $1-9 SAS 10-12 DS 13-15 R 16-18;
datalines;
Akash    18 19 20
Tejas       20 16
Shashank 16 20
Meera    18    20
;
RUN;
Related
I am pretty new to SAS. I am trying to see which songs/artists/albums have appeared most often in my Spotify most-played CSVs (2017-2020). I am getting stuck very early on, just trying to read the 2017 CSV into a data set. Can anyone see what I am doing wrong? It seems like this step should be pretty straightforward.
data Spotify_2017;
infile='C:\Users\your_top_songs_2017.csv' dlm=’09’x dsd firstobs=2;
input Track URI Track Name Artist URI Artist Name Album URI Album Name Album Release Date Disc Number Track Number Track Duration Explicit Popularity Added By Added At;
run;
and here is the log:
1 The SAS System 10:18 Friday, January 15, 2021
1 ;*';*";*/;quit;run;
2 OPTIONS PAGENO=MIN;
3 %LET _CLIENTTASKLABEL='Spotify.sas';
4 %LET _CLIENTPROCESSFLOWNAME='Standalone Not In Project';
5 %LET _CLIENTPROJECTPATH='';
6 %LET _CLIENTPROJECTPATHHOST='';
7 %LET _CLIENTPROJECTNAME='';
8 %LET _SASPROGRAMFILE='C:\Users\xxx\Desktop\Spotify\Spotify.sas';
9 %LET _SASPROGRAMFILEHOST='USRDUL-PC0NNXU1';
10
11 ODS _ALL_ CLOSE;
12 OPTIONS DEV=SVG;
13 GOPTIONS XPIXELS=0 YPIXELS=0;
14 %macro HTML5AccessibleGraphSupported;
15 %if %_SAS_VERCOMP_FV(9,4,4, 0,0,0) >= 0 %then ACCESSIBLE_GRAPH;
16 %mend;
17 FILENAME EGHTML TEMP;
18 ODS HTML5(ID=EGHTML) FILE=EGHTML
19 OPTIONS(BITMAP_MODE='INLINE')
20 %HTML5AccessibleGraphSupported
21 ENCODING='utf-8'
22 STYLE=HtmlBlue
23 NOGTITLE
24 NOGFOOTNOTE
25 GPATH=&sasworklocation
26 ;
NOTE: Writing HTML5(EGHTML) Body file: EGHTML
27
28 data Spotify_2017;
29 infile='C:\Users\pcardella\Desktop\Spotify\C:\Users\xxx\Desktop\Spotify\your_top_songs_2017.csv' dlm=’09’x dsd
___
388
76
29 ! firstobs=2;
ERROR 388-185: Expecting an arithmetic operator.
ERROR 76-322: Syntax error, statement will be ignored.
30 input Track URI Track Name Artist URI Artist Name Album URI Album Name Album Release Date Disc Number Track Number Track
30 ! Duration Explicit Popularity Added By Added At;
31 run;
ERROR: No DATALINES or INFILE statement.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.SPOTIFY_2017 may be incomplete. When this step was stopped there were 0 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
32
33 %LET _CLIENTTASKLABEL=;
34 %LET _CLIENTPROCESSFLOWNAME=;
35 %LET _CLIENTPROJECTPATH=;
36 %LET _CLIENTPROJECTPATHHOST=;
37 %LET _CLIENTPROJECTNAME=;
38 %LET _SASPROGRAMFILE=;
39 %LET _SASPROGRAMFILEHOST=;
2 The SAS System 10:18 Friday, January 15, 2021
40
41 ;*';*";*/;quit;run;
42 ODS _ALL_ CLOSE;
43
44
45 QUIT; RUN;
46
infile is a statement and does not need an equals sign. The syntax is:
infile 'file location here' <options>;
data Spotify_2017;
infile 'C:\Users\your_top_songs_2017.csv' dlm='09'x dsd firstobs=2;
input Track URI Track Name Artist URI Artist Name Album URI Album Name Album Release Date Disc Number Track Number Track Duration Explicit Popularity Added By Added At;
run;
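Note that the input statement above will still not give you the variables you expect: under the default VALIDVARNAME=V7 setting a SAS variable name cannot contain spaces, so Track URI is read as two separate variables, Track and URI. Here is a minimal sketch of one way around that. It assumes the file really is comma-delimited (as the .csv extension suggests); the underscored names, the lengths, and the choice of which columns are numeric are my guesses and are not taken from your file:

```sas
data Spotify_2017;
    /* dsd: comma is the default delimiter, quoted strings are honoured,
       and two consecutive delimiters are read as a missing value */
    infile 'C:\Users\your_top_songs_2017.csv' dsd truncover firstobs=2;
    /* character lengths are assumptions; adjust them to your data */
    length Track_URI Artist_URI Album_URI $60
           Track_Name Artist_Name Album_Name $200
           Album_Release_Date $10 Explicit $5
           Added_By $60 Added_At $30;
    /* variables not listed in the length statement are read as numeric */
    input Track_URI Track_Name Artist_URI Artist_Name Album_URI Album_Name
          Album_Release_Date Disc_Number Track_Number Track_Duration
          Explicit Popularity Added_By Added_At;
run;
```

If the file turns out to be tab-delimited after all, add dlm='09'x back to the infile statement.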
One way to help learn importing raw files using the data step is to use proc import. proc import will import the data and generate data step code for you in the log when importing csv files. You can study it to see how it works and try to replicate it.
proc import
file = 'C:\Users\your_top_songs_2017.csv'
out = spotify_2017
dbms = csv
replace;
run;
Also, a great option to help make logs more readable in Enterprise Guide is to disable autogenerated code. Go to Tools -> Options -> Results -> General -> uncheck "Show generated wrapper code in SAS log"
This is my code:
DATA sales;
INFILE 'D:\Users\...\Desktop\Onions.dat';
INPUT VisitingTeam $ 1-20 ConcessionSales 21-24 BleacherSales 25-28
OurHits 29-31 TheirHits 32-34 OurRuns 35-37 TheirRuns 38-40;
PROC PRINT DATA = sales;
TITLE 'SAS Data Set Sales';
RUN;
This is the data, but the spacing may be incorrect.
Columbia Peaches 35 67 1 10 2 1
Plains Peanuts 210 . 2 5 0 2
Gilroy Garlics 151035 12 11 7 6
Sacramento Tomatoes 124 85 15 4 9 1
;
I need to add or delete a blank column at the 19th column. Can someone help?
Just open the dataset and check the name of the variable you want to remove. Then run:
data want (drop=variable_name_you_are_dropping); /* this is your output dataset */
set have; /* this is the dataset you have */
run;
Say I have two rows of data that I am trying to read in.
cody: 10 9 20 18
john: 4 5 1 2
and I want to read them in a two row style in datalines, like such:
input cody john @@;
datalines;
10 9 20 18
4 5 1 2
run;
But this reads it in like cody: 10 20 4 1 john: 9 18 5 2
How do I fix this?
You'd need to read in the CODY lines all at once, then the JOHN lines all at once. It's unclear what the final data structure should look like, but this is one possibility, and then you can restructure this how you wish, perhaps with PROC TRANSPOSE.
Basically, I assign name to the proper name (using an array here, but you can do this in better ways, data-driven ways, depending on your data). Then I loop and tell SAS to keep reading in data until it is unable to read any more, using the truncover option (or missover is also fine) to make sure it doesn't skip to the next line, and output a new row for each value.
data want;
array names[2] $ _temporary_ ("Cody","John") ;
infile datalines truncover;
do _name = 1 to 2;
name = names[_name];
do _i = 1 by 1 until (missing(value));
input value @;
if not missing(value) then output;
end;
input;
end;
drop _:;
datalines;
10 9 20 18
4 5 1 2
;
run;
I think that the solution to your problem is to use the names as another column, not as variables, like this:
data foo;
input var1 $ var2 var3 var4 var5;
datalines;
cody 10 9 20 18
john 4 5 1 2
;
run;
What is the SAS code to stack data?
For the purpose of example, let's say I have this dataset:
DATA test.one;
INPUT Name $ Y1996 Y1997 Y1998 Y1999;
cards;
Dan 5 10 40 20
Derek 10 12 10 10
run;
proc print data = test.one;
run;
Running this set would give me an output like this:
Name Y1996 Y1997 Y1998 Y1999
Dan 5 10 40 20
Derek 10 12 10 10
However, I would want my data to look like this:
Name Year Income
Dan 1996 5
Dan 1997 10
Dan 1998 40
Dan 1999 20
Derek 1996 10
Derek 1997 12
Derek 1998 10
Derek 1999 10
That is, stacking the data as shown above would create a new variable, Income, holding the values from the year columns.
Are you asking how to read the raw data directly into that form?
DATA want;
INPUT Name $ @; /* trailing @ holds the current line for the next INPUT */
do year=1996 to 1999;
input income @;
output;
end;
cards;
Dan 5 10 40 20
Derek 10 12 10 10
;
PROC TRANSPOSE can solve this:
DATA test.one;
INPUT Name $ y1996 y1997 y1998 y1999;
cards;
Dan 5 10 40 20
Derek 10 12 10 10
run;
proc print data = test.one;
run;
proc transpose data=test.one out=long1;
by name;
run;
data test2;
set long1 (rename=(col1=Income));
RUN;
It will then transform the dataset into a stacked version.
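One thing to be aware of: the transposed data set keeps the old variable names (y1996, y1997, ...) as text in the automatic _NAME_ variable rather than giving you a numeric year. A small sketch of one way to derive Year, assuming the names always consist of a single letter followed by a four-digit year; the data set name stacked is only illustrative:

```sas
data stacked;
    set long1 (rename=(col1=Income));
    /* _NAME_ holds 'y1996', 'y1997', ...; skip the leading letter
       and convert the remaining four digits to a number */
    Year = input(substr(_name_, 2), 4.);
    drop _name_;
run;
```

The BY statement in the PROC TRANSPOSE step expects the input to be sorted by name, which is already the case for this sample.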
I'm trying to read in some raw data using datalines...
data Exp_data;
INPUT a: 2. b: 2. DATE1: MMDDYY10. DATE2: MMDDYY10.;
FORMAT DATE1 DATE9. DATE2 DATE9.;
datalines;
27 93 03/16/2008 03/17/2008
27 93 03/17/2009 03/19/2009
68 68
55 55
46 68
34 34
45 67
56 75
34 34
34 34
;RUN;
But this code only reads data up to the 6th row. I can't figure out what I'm doing wrong.
Thanks in advance!
Add this line before your input statement:
infile datalines missover;
From the third row onward you don't have four values, so SAS needs to be told what to do when it runs out of data on a line. MISSOVER tells SAS to set the remaining variables to missing instead of moving on to the next line.
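For reference, here is a trimmed-down sketch of the step with the infile statement in place (only three of your data lines are kept, to keep it short). Under the default behaviour, SAS goes to the next line to look for DATE1 and DATE2, so each pair of short lines collapses into a single observation, which is why you ended up with fewer rows than you typed:

```sas
data Exp_data;
    /* missover: if a line runs out of values, set the rest to missing */
    infile datalines missover;
    input a : 2. b : 2. DATE1 : mmddyy10. DATE2 : mmddyy10.;
    format DATE1 DATE2 date9.;
    datalines;
27 93 03/16/2008 03/17/2008
68 68
55 55
;
run;
```

TRUNCOVER would behave the same way for this data.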