I can't find the @ symbol through search engines or in the syntax documentation.
What does the @ operator do in SAS? I can't find any reference to it in the operators list either: https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000780367.htm
data dsfa.asdfa;
infile asdfsadf;
input @1 sdfs $12. @14 sdlfsda $10. sdlfsd $1.;
run;
It's not an operator. It's a pointer control that is part of the INPUT statement, and it is documented there. There are the trailing @ and @@, which hold the input line, and the @n form that comes before a variable name. In the latter case, it tells SAS which column to start reading from.
@n moves the pointer to column n.
See the INPUT statement documentation for details.
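With formatted input, you can control where values are read from a raw data file in three ways:
1) Relative column pointer (+n), which skips n columns
2) Absolute column pointer (@n), which moves to column n
3) Informat width (w.), which sets how many columns to read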
So in the below code:
input @1 sdfs $12. @14 sdlfsda $10. sdlfsd $1.;
@1 tells SAS to start at column 1 and read up to 12 characters.
@14 tells SAS to start at column 14 and read up to 10 characters.
Note: @n is the column number at which to start reading.
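A minimal sketch of the column pointer in action. In SAS the absolute column pointer is written @n, whereas #n moves to line n of a multi-line record. The dataset name, variable names, widths, and sample rows below are made up for illustration and are not from the question.

```sas
data pointer_demo;
    /* @1 starts reading at column 1, @14 jumps to column 14.      */
    /* flag has no pointer, so it is read right after city ends,    */
    /* that is, from column 24.                                     */
    input @1 name $12. @14 city $10. flag $1.;
    datalines;
Alice Smith  Boston    Y
Bob Jones    Chicago   N
;
run;

/* Show the values that were read */
proc print data=pointer_demo;
run;
```

Each informat width counts columns from wherever the pointer currently sits, which is why only the first two variables need an explicit @n here.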
I have two lines of observations to read in SAS.
It is a comma-delimited data set.
My code is as below:
DATA SASweek1.industry;
INFILE "&Dirdata.Assignment1_Q6_data.txt" DLM="," DSD termstr=crlf TRUNCOVER;
LENGTH Company $ 15;
INPUT Company $ State $ Expense COMMA9. ;
FORMAT Expense DOLLAR9.;
*INFORMAT Expense DOLLAR10.;
RUN; * not ready;
The raw data set looks like this:
I can print out the first line of observations well,
but the last "0" will go to the first position of the second
line, becoming "0Lee's..".
Any suggestions would be highly appreciated!!
It is just doing what you told it to do. You told it to read exactly 9 characters.
Normally you should not use formatted input mode with delimited data. You prevent that by either adding the : (colon) prefix in front of the informat specification in the INPUT statement or removing the informat specification completely and using an INFORMAT statement to let SAS know what informat to use.
But your data is NOT properly delimited because the last field contains the delimiter, but the value is not enclosed in quotes. So the commas make it look like two values instead of one. The real solution is to fix the process that created the file to create a valid delimited file. It needs to quote the values with commas in them, or remove the commas from the numbers, or use a delimiter character that does not appear in the data.
Fortunately since it is the last field on the line you CAN use formatted input to read just that field. Since you are using the TRUNCOVER option just set the width of the informat in the INPUT statement to the maximum.
DATA SASweek1.industry;
INFILE "&Dirdata.Assignment1_Q6_data.txt" DLM="," DSD termstr=crlf TRUNCOVER;
LENGTH Company $15 State $15 Expense 8;
INPUT Company State Expense COMMA32. ;
FORMAT Expense DOLLAR9.;
RUN;
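For comparison, here is a sketch of the colon-modifier approach mentioned above, assuming the file had been written as a valid delimited file with the comma-formatted amounts enclosed in quotes. The dataset name and the sample rows are invented for illustration; they are not the asker's data.

```sas
data expenses_demo;
    /* DSD strips the surrounding quotes and keeps commas inside them as data */
    infile datalines dlm=',' dsd truncover;
    length Company $15 State $2;
    /* The colon makes this modified list input: the informat is applied, */
    /* but reading stops at the next delimiter rather than at a fixed width */
    input Company State Expense :comma12.;
    format Expense dollar12.;
    datalines;
Acme Corp,NY,"1,234,500"
Beta LLC,CA,980
;
run;

proc print data=expenses_demo;
run;
```

Because the field boundary comes from the delimiter, a short value on one line cannot pull characters from the next line.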
I can not find the way to reverse text strings.
For example I want to reverse these:
MMMM121231M34 to become 43M132121MMMM
MM1M11M1 to become 1M11M1MM
1111213111 to become 1113121111
Judging from your examples, what you mean by 'rearrange' is actually 'reverse'.
In that case, you've got the very handy reverse() function in SAS.
Used in context:
data test;
length text $32;
infile datalines;
input text $;
result=reverse(strip(text));
datalines;
MMMM121231M34
MM1M11M1
1111213111
;
run;
EDIT on @Joe's request: in the example above, I create the test dataset with a length of 32 characters for the text variable. When the values are read from datalines, they are therefore padded with trailing blanks up to 32 characters. Reversing such a value puts all those blanks at the start of the result, followed by the value you are actually looking for. The strip function removes the excess blanks from text before reversing, so only the "real" value is kept in the result.
I'm familiar with the :, and ~ modifiers in SAS put and input statements. The behaviour of & in an input statement is also fairly well documented. But what does & do in a put statement?
It seems to have a similar effect to :, triggering modified list output rather than formatted output, but I can't find any documentation of this behaviour.
E.g.
data _null_;
set sashelp.class;
file 'c:\temp\output.csv' dlm=',';
put Name Sex Age & 4. Height Weight;
run;
Quoting from the on-line documentation in the section of SAS 9.4 under INPUT Statement, List
&
indicates that a character value can have one or more single embedded blanks. This format modifier reads the value from the next non-blank column until the pointer reaches two consecutive blanks, the defined length of the variable, or the end of the input line, whichever comes first.
Restriction:
The & modifier must follow the variable name and $ sign that it affects.
Tip:
If you specify an informat after the & modifier, the terminating condition for the format modifier remains two blanks.
Here is an example from the example section:
Example Reading Character Data That Contains Embedded Blanks
The INPUT statement in this DATA step uses the & format modifier with list input to read character values that contain embedded blanks.
data list;
infile file-specification;
input name $ & score;
run;
It can read these input data records:
----+----1----+----2----+----3----+
Joseph  11 Joergensen  red
Mitchel  13 Mc Allister  blue
Su Ellen  14 Fischer-Simon  green
The & modifier follows the variable that it affects in the INPUT statement. Because this format modifier follows NAME, at least two blanks must separate the NAME field from the SCORE field in the input data records.
You can also specify an informat with a format modifier, as shown here:
input name $ & +3 lastname & $15. team $;
In addition, this INPUT statement reads the same data to demonstrate that you are not required to read all the values in an input record. The +3 column pointer control moves the pointer past the score value in order to read the value for LASTNAME and TEAM.
In the following code
data temp2;
input id 1 @3 date mmddyy11.;
cards;
1 11/12/1980
2 10/20/1996
3 12/21/1999
;
run;
What do the 1 and @3 symbols mean? I presume 1 means that id is the first character in the data. I know that @3 means that the date variable starts at the third character, but why is it in front of date whereas 1 comes after id?
Because that's a badly written input statement. You can specify input in a number of ways, and that mixes a few different ways to do things which happen to be allowed to mix (mostly). Read the SAS documentation on input for more information.
Some common styles that you can use:
input @1 id $5.; *Formatted input. Allows specification of start position and informat, more useful if using a date or other informat that is not just normal character/number.;
input id str $ otherstr $ date :date9.; *List input. This is for delimited text (like a CSV), still lets you specify an informat.;
input @'ID:' id $5.; *A special case of formatted input. Allows you to parse files that include the variable name, useful for old style files and some xml/json/etc. type files.;
input x 1-10 y 11-20; *Column input. Not used very commonly as it is less flexible than start/informat style.;
There are other options (such as named input) that are not very frequently used in my experience.
In your specific example, the first variable is read with column input [id 1 says 'read a 1 character numeric from position 1 into id'] and the second variable is read with formatted input [@3 date mmddyy11. says 'read an 11 character date from position 3[-13] into a numeric, using the date informat to translate it to a number']. It also suggests that whoever gave you that code isn't very familiar with SAS, since mmddyy10. is the correct informat; the 11th character cannot be helpful.
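A sketch of the same step written consistently with formatted input, assuming the data layout from the question. The FORMAT statement and the PROC PRINT step are additions for display only.

```sas
data temp2;
    /* @1 and @3 are absolute column pointers; 1. and mmddyy10. are informats */
    input @1 id 1. @3 date mmddyy10.;
    /* Show the stored SAS date value as a readable date (added for illustration) */
    format date mmddyy10.;
    datalines;
1 11/12/1980
2 10/20/1996
3 12/21/1999
;
run;

proc print data=temp2;
run;
```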
I am reading a period '.' as a character variable's value but it is reading it as a blank value.
data output1;
input @1 a $1. @2 b $1. @3 c $1.;
datalines;
!..
1.3
;
run;
Output        Required
------        --------
A B C         A B C
!             ! . .
1   3         1 . 3
Please help me in reading a period as such.
The output is determined by the informat used (the $w. informat in your case, requested by $1. in your code; $1. is first of all an informat specification, and the variable's length is a side effect of it).
Use $char. informat for desired result.
data output1;
input @1 a $char1. @2 b $char1. @3 c $char1.;
datalines;
!..
1.3
;
run;
From documentation:
$w. informat
The $w. informat trims leading blanks and left aligns the values before storing the text. In addition, if a field contains only blanks and a single period, $w. converts the period to a blank because it interprets the period as a missing value. The $w. informat treats two or more periods in a field as character data.
$CHARw. informat
The $CHARw. informat does not trim leading and trailing blanks or convert a single period in the input data field to a blank before storing values.
I don't immediately see why it does not work.
But if you are not interested in figuring out why it does not work and just want something that does: read the line in as one variable of length $3, then split it with substr in a second step.
E.g.,
data output1;
length tmp $3;
input tmp;
datalines;
!..
1.3
;
run;
data output2 (drop=tmp);
length a $1;
length b $1;
length c $1;
set output1;
a=substr(tmp,1,1);
b=substr(tmp,2,1);
c=substr(tmp,3,1);
run;