Complete columns based on values that precede data as table format in SAS - sas

How should the code be completed to make this work?
Code:
data ms;
infile 'C';
input cr ls ms color $;
if input @; *statement that reads the line with one word and completes the color column;
run;
Input:
Blars
10 83287 10.00
20 1748956 30.00
30 2222222 73.00
40 833709 90.00
Klirs
10 922222 90.50
20 1222222 10.00
30 1111111 93.33
40 8998877 300.90
Expected output:
cr  ls       ms      color
10  83287    10.00   Blars
20  1748956  30.00   Blars
30  2222222  50.00   Blars
40  833709   73.00   Blars
10  922222   90.50   Klirs
20  1222222  10.00   Klirs
30  1111111  93.33   Klirs
40  8998877  300.90  Klirs
Attempted to read it

Just RETAIN the extra variable. You need some way to detect which type of line you currently are reading. When it has the COLOR just update the COLOR variable and do not write out an observation. When it has the actual data then read all of the fields and write an observation.
data ms;
infile 'C' truncover ;
length color $10 cr ls ms 8;
retain color;
input cr ?? @ ;
if missing(cr) then do;
color = _infile_;
delete;
end;
input ls ms ;
run;
Make sure to define the COLOR column long enough to store the longest value. This assumes there are no blank lines, as you mentioned in your comment on the original question.

Slightly different method than other solution.
Use INPUT @@ to read the full line and hold it; the held line is available in the automatic variable _infile_.
Check the _infile_ variable to see if it contains any digits; if so, process the line as data.
Otherwise, process it as a colour.
data have;
infile cards truncover;
*set length and retain color across rows;
length color $10 cr ls ms 8;
retain color;
*read in string;
input @@;
*check for any digits in string, if any are found, process as data;
if anydigit(_infile_) then do;
input cr ls ms;
output;
end;
*otherwise read in as color;
else input color $;
cards;
Blars
10 83287 10.00
20 1748956 30.00
30 2222222 73.00
40 833709 90.00
Klirs
10 922222 90.50
20 1222222 10.00
30 1111111 93.33
40 8998877 300.90
;;;;
run;

Richard, your code could even be more succinct.
* attempt to read first 2 chars as number;
* ?? suppresses errors;
input num ?? 2. @;
if missing(num) then
input @1 color $;
else do;
input @1 cr ls ms;
output;
end;

You can scan a held generic input line and then choose which input statement you want based on the scan.
data want;
length color $20 cr ls ms 8;
retain color;
infile 'c' missover;
input @;
if missing(input(scan(_infile_,1),??best12.)) then
input @1 color ;
else
input @1 cr ls ms ;
if not missing(cr);
run;

Related

compute variable after datalines

I have the following dataset (fictional data).
DATA test;
INPUT name $ age height weight;
DATALINES;
Peter 20 1.70 80
Hans 30 1.72 75
Tina 25 1.67 65
Luisa 10 1.20 50
;
RUN;
How can I compute a new variable "bmi" (weight / height^2) directly after the end of the DATALINE-command? Unfortunately in my SAS-book all the examples are with DATA ... INFILE= instead of using DATALINES.
PROC PRINT
DATA = test;
TITLE 'Fictional Data';
RUN;
DATALINES must be the last statement in the data step, so your computation statements go after the INPUT statement and before DATALINES:
INPUT name $ age height weight;
bmi = weight / height**2;
DATALINES;
…
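For completeness, here is a sketch of the whole step using the data from the question. The FORMAT statement is my own addition (an assumption about how you want bmi displayed) and can be dropped.

```sas
data test;
    input name $ age height weight;
    /* executed once per record, right after INPUT fills the variables */
    bmi = weight / height**2;
    /* optional: display bmi with one decimal place (assumption, not in the question) */
    format bmi 5.1;
    datalines;
Peter 20 1.70 80
Hans 30 1.72 75
Tina 25 1.67 65
Luisa 10 1.20 50
;
run;

proc print data=test;
    title 'Fictional Data';
run;
```

The assignment runs for every data line because the data step loops once per record; DATALINES only marks where the records start.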

adding space to character variables

when I run the following code, I see that the number in my character variable gets shifted as a value
data test;
input names$ score1 score2;
cards;
A1 80 95
A 2 80 95
;
run;
proc print data=test;
run;
leading to an output like the following:
The SAS System
Obs    names    score1    score2
 1     A1         80        95
 2     A           2        80
How do I create a variable like "A 2" with space so that the 2 doesn't get shifted
Your problem is you're using space delimited data input. Is it truly space delimited, though, or is it columnar (fixed position)?
data test;
input names $ 1-4 score1 5-12 score2 13-20;
cards;
A1  80      95
A 2 80      95
;
run;
If it's truly delimited and you're just not exactly replicating the data here, you have a few choices. You can use the & modifier, which tells SAS to treat two or more consecutive spaces as the delimiter so that a single space can be part of the value. Your data as posted doesn't have two spaces between the fields, but if it did, it would look like so:
data test;
input names &$ score1 score2;
cards;
A1  80 95
A 2  80 95
;
run;
Or, if you truly have some single spaces that are delimiters and some single spaces that are not, you'll have to work out some logic to handle it. The exact logic depends on your rules, but here's an example: I look for a space at position 2 and, if it is there, assume that exactly one more character of the name follows, then shift everything after that down by one so that there is a guaranteed double space. This is probably not a good rule for you, but it is an example of what you might do.
data test;
input @;
if substr(_infile_,2,1)=' ' then do; *if there is a space at spot two specifically;
_infile_ = substr(_infile_,1,3)||' '||substr(_infile_,4); *shift everything after 3 down;
end;
input names &$ score1 score2;
cards;
A1  80 95
A 2 80 95
;
run;
If your input is fixed block, as suggested, and the NAMES field is 12 bytes, as suggested by the data, then you can use formatted input for NAMES.
data test;
length names $ 12 score1 score2 8;
input names $12. score1 score2;
names=trim(left(names));
cards;
A1          80 95
A 2         80 95
;
run;

usage of missover statement in SAS

I have below dataset
data ab;
infile cards missover;
input m p c;
cards;
1,2,3
4,5,
6,7,
run;
The output of this query is
m p c
. . .
. . .
. . .
Why did I get the output below instead of an error?
I haven't specified any delimiter, either.
Please explain.
Thanks in Advance,
Nikhila
You do get INVALID DATA messages in the log. SAS defaults to space-delimited fields, so you need to specify the DSD option and/or DLM=',' on the INFILE statement. You don't actually need MISSOVER here, since every line has the proper number of delimiters for three comma-delimited fields, but I would probably keep it anyway.
24 data ab;
25 infile cards missover;
26 input m p c;
27 cards;
NOTE: Invalid data for m in line 28 1-5.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
28 1,2,3
m=. p=. c=. _ERROR_=1 _N_=1
NOTE: Invalid data for m in line 29 1-4.
29 4,5,
m=. p=. c=. _ERROR_=1 _N_=2
NOTE: Invalid data for m in line 30 1-4.
30 6,7,
m=. p=. c=. _ERROR_=1 _N_=3
NOTE: The data set WORK.AB has 3 observations and 3 variables.
24 data ab;
25 infile cards dsd missover;
26 input m p c;
27 cards;
NOTE: The data set WORK.AB has 3 observations and 3 variables.
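To see what the DSD version actually reads, here is a small sketch with a PUT statement added (the PUT is my addition, just for inspection). DSD uses a comma as its default delimiter, so no DLM= is needed for this data.

```sas
data ab;
    /* DSD: comma-delimited, and a trailing comma means a missing last field */
    infile cards dsd missover;
    input m p c;
    /* write the values to the log to check what was read */
    put (m p c) (=);
cards;
1,2,3
4,5,
6,7,
;
run;
```

The log should show m=1 p=2 c=3, then m=4 p=5 c=. and m=6 p=7 c=., with no invalid data notes.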
MISSOVER is what gives you three observations instead of just one. Without MISSOVER, SAS tries to read each line as one value and you end up with one observation of all missing values. It is easier to see if you change your variables to character instead of numeric, since then you can see where the values end up.
data ab;
infile cards missover;
input m $ p $ c $;
put (m p c) (=);
cards;
1,2,3
4,5,
6,7,
;
m=1,2,3 p= c=
m=4,5, p= c=
m=6,7, p= c=
data ab;
infile cards /*missover */;
input m $ p $ c $;
put (m p c) (=);
cards;
1,2,3
4,5,
6,7,
;
m=1,2,3 p=4,5, c=6,7,
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.

Transform numbers with 0 values at the beginning

I have the following dataset:
DATA survey;
INPUT zip_code number;
DATALINES;
1212 12
1213 23
1214 23
;
PROC PRINT; RUN;
I want to link this data to another table but the thing is that the numbers in the other table are stored in the following format: 0012, 0023, 0023.
So I am looking for a way to do the following:
Check how long the number is
If length = 1, add 3 0 values to the beginning
If length = 2, add 2 0 values to the beginning
Any thoughts on how I can get this working?
Numbers are numbers so if the other table has the field as a number then you don't need to do anything. 13 = 0013 = 13.00 = ....
If the other table actually has a character variable then you need to convert one or the other.
char_number = put(number, Z4.);
number = input(char_number, 4.);
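A quick way to check both conversions is a throwaway step that writes to the log. This is only a sketch; the variable name back is made up for the illustration.

```sas
data _null_;
    number = 12;
    /* numeric -> character with leading zeros */
    char_number = put(number, z4.);
    /* character -> numeric again */
    back = input(char_number, 4.);
    put char_number= back=;
run;
```

The log line should read char_number=0012 back=12, so whichever side of the join is character, you convert the other side to match it.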
You can use z#. formats to accomplish this:
DATA survey;
INPUT zip_code number;
DATALINES;
1212 12
1213 23
1214 23
9999 999
8888 8
;
data survey2;
set survey;
number_long = put(number, z4.);
run;
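If number is itself a character variable and you need it to be four characters long, then you could do it like this (INPUT turns the text into a number, PUT writes it back with leading zeros):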
want = put(input(number,best32.),z4.);

Beginner. Reading data in SAS (Reading date and 100 score issue)

The problem says: The first line is a header line and should not be read (use the INFILE option FIRSTOBS=2). The remaining lines contain an ID number (character), gender (character), date of birth (DOB), and two scores (SCORE1 and SCORE2). Note that there are some missing values for the scores, and you want to be sure that SAS does not go to a new line to read these values. Write a SAS DATA step to read DOB with DATE9. Here are the lines of data (I put them in my code to save space).
DATA READ;
INFILE DATALINES FIRSTOBS=2;
INPUT ID 1-3
GENDER $ 5
@7 DOB mmddyy10.
@ SCORE1 3
@ SCORE2 3
;
DATALINES;
***Header line: ID GENDER DOB SCORE1 SCORE2
001 M 10/10/1976 1OO 99
002 F 01/01/1960 89
003 M 05/07/2001 90 98
;
DATA PROB12_8;
SET READ;
FORMAT DOB MMDDYY9.;
RUN;
PROC PRINT DATA=PROB12_8;
RUN;
My output is:
OBS ID GENDER DOB SCORE1 SCORE2
1 1 M . . 99
2 2 F . 89 .
3 3 M . 90 98
I don't understand why the program reads the data that way when I specify the column positions and use the pointer in my program.
Thanks for your help.
Your problems start at SCORE1 and SCORE2, where the pointer control is specified incorrectly. Also notice that 1OO (typed with the letter O) is not 100. This file can be read easily with list input and the MISSOVER option on the INFILE statement.
DATA READ;
INFILE DATALINES FIRSTOBS=2 missover;
informat id $3. gender $1. dob mmddyy10.;
input ID GENDER DOB SCORE1 SCORE2;
format dob mmddyy10.;
datalines;
***Header line: ID GENDER DOB SCORE1 SCORE2
001 M 10/10/1976 1OO 99
002 F 01/01/1960 89
003 M 05/07/2001 90 98
;;;;
run;