How to use SYSFUNC and PUTC or PUTN - sas

In a macro I have 2 variables, A and B. A is a cycle variable and is an integer from 1 to 12. B needs to be 01 when A is 1, 02 when A is 2, etc, 10 when A is 10, 11 when A is 11 and 12 when A is 12. Basically B needs to be 2 digits, possibly with leading zeros. In a datastep this is easy:
B=PUT(A,z2.);
But this code will not work inside a macro, and %SYSFUNC does not work with the PUT function. So how can the job be done?

putn works fine. putc is for incoming character arguments ($ formats); putn is for incoming numeric arguments like this one.
%let a=5;
%let b=%sysfunc(putn(&a,z2.));
%put &=a. &=b.;
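Since the question describes A as a loop variable running from 1 to 12, here is a rough sketch of the same call inside a macro loop. The macro name pad_demo is made up for illustration; only %sysfunc and putn come from the answer above.

```sas
/* Hypothetical macro: pad the loop counter to two digits on each pass */
%macro pad_demo;
  %local a b;
  %do a = 1 %to 12;
    /* putn applies the numeric format z2. to the macro value */
    %let b = %sysfunc(putn(&a, z2.));
    %put a=&a b=&b;
  %end;
%mend pad_demo;

%pad_demo
```

Each pass should write one line to the log, with b holding the zero-padded form of a.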

Expecting an arithmetic operator in SAS

I get an error with this SAS code and I'm not sure what is wrong with it.
Data june;
infile "/home/u/Work/NHS/20200630-RTT-JUNE.csv" DLM=',';
'Gt 00 To 01 Weeks SUM 1'n = input('Gt 00 To 01 Weeks SUM 1'n);
run;
ERROR 388-185: Expecting an arithmetic operator.
ERROR 76-322: Syntax error, statement will be ignored.
Thank you very much.
You did not provide the INPUT() function call with any informat specification to use. The INPUT() function requires two arguments, the text string to convert and the informat to use to do the conversion.
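For reference, a minimal sketch of INPUT() called with both arguments. The variable names txt and num are made up for illustration, and 8. is just one informat that suits a short string of digits.

```sas
data _null_;
  txt = '123';            /* character value to convert */
  num = input(txt, 8.);   /* second argument is the informat */
  put num=;
run;
```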
Also your code will not do anything since you are not reading any data from the file you pointed to with the INFILE statement.
You probably need to use the INPUT statement instead. For example code like this:
data june;
infile "/home/u59602916/Work/NHS/20200630-RTT-JUNE-2020-full-extract-revised.csv" DLM=',';
input 'Gt 00 To 01 Weeks SUM 1'n;
run;
This will attempt to read one number from each line of the source text file. It will read the characters up to the first comma on the line.
If you really have a CSV file you should use the DSD option and not just the DLM= option. That will properly handle values that might be enclosed in quotes because they contain the delimiter. It will also properly treat adjacent delimiters as indicating an empty value.
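To see what DSD changes, here is a small self-contained sketch that reads in-stream lines instead of the real file. The data set name dsd_demo, the variables and the two data lines are invented for the demonstration.

```sas
/* Hypothetical data: a quoted value containing a comma, and an empty field */
data dsd_demo;
  infile datalines dsd dlm=',' truncover;
  input name :$20. a b;
  datalines;
"Smith, J",1,2
Jones,,3
;
run;

proc print data=dsd_demo;
run;
```

With DSD, the quoted name should come through as a single value with its quotes removed, and the two adjacent commas on the second line should leave a missing. Without DSD, the comma inside the quotes would be treated as a delimiter.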
Use the put function if you want to convert numeric values to character. Also, you'll need to use the input statement to indicate the variable(s) you would like to read in from the external file.
Data june;
infile "/home/u59602916/Work/NHS/20200630-RTT-JUNE-2020-full-extract-revised.csv" DLM=',';
* Read the field as a number first, weeks_num is a placeholder name;
input weeks_num;
* PUT converts the numeric value to a 4-character string;
'Gt 00 To 01 Weeks SUM 1'n = put(weeks_num, 4.);
run;
If you want to read in a numeric value, don't use a function; simply use the input statement as above, but without the dollar sign, so that the variable is read as numeric.
Data june;
infile "/home/u59602916/Work/NHS/20200630-RTT-JUNE-2020-full-extract-revised.csv" DLM=',';
input 'Gt 00 To 01 Weeks SUM 1'n ;
run;
http://support.sas.com/kb/24/590.html
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.2/lestmtsref/n1rill4udj0tfun1fvce3j401plo.htm#p1sic7f8w8x1wpn1i07zx5vhwdwj

Substring the last digits of a numeric variable in SAS

I have a numeric variable in SAS and I am struggling to extract the last digits of it. I tried using substr but it only handles char variables. The variable I have sometimes has 3 or 4 digits.
Example
1234
237
754
9000
In these cases I need to extract
34
37
54
00
And store them as a new numeric variable. I tried the code below in a proc sql statement but it returns an error. Can someone help me?
Var2 = input(substr(put(var1), 1, length(put(var1))-1), 8.)
SUBSTRN() works with numeric variables, but it doesn't work well in this case because there's no easy way to ask for only the last two characters. The MOD() function fits better, because the last two digits are the remainder after dividing by 100. Since it looks like you want a character result, you also need PUT() to convert it to character, with the Z2. format to keep leading zeros (so 9000 gives 00).
want = put(mod(value, 100), z2.);
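As a sketch of how that line could sit in a full data step, using the sample values from the question. The names want, last2_num and last2_char are placeholders, and the numeric version is included because the question asks for a numeric result.

```sas
data want;
  input value;
  last2_num  = mod(value, 100);        /* numeric remainder, e.g. 34 */
  last2_char = put(last2_num, z2.);    /* character, zero-padded */
  format last2_num z2.;                /* display the numeric version as two digits */
  datalines;
1234
237
754
9000
;
run;
```

The numeric variable stores 0 for 9000; the z2. format only changes how it is displayed.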
Another way is to use prxchange, as shown below:
data have;
input val1;
val2= input(prxchange('s/^(.*)(\d{2})$/$2/', -1, trim(put(val1,8.))),best.);
format val2 z2.;
datalines;
1234
237
754
9000
40
;
You can also use the reverse and substr functions as follows, and then apply the z2. format.
input(reverse(substr(reverse(strip(put(var1,best.))), 1,2)), 8.);

SAS print ASCII value of special character

I am using the notalnum function in SAS. The input is a db field. Now, the function is returning a value that tells me there is a special character at the end of every string.
It is not a space character, because I have used COMPRESS function on the input field.
How can I print the ASCII value of the special character at the end of each string?
The $HEX. format is the easiest way to see what they are:
data have;
var="Something With A Special Char"||'0D'x;
run;
data _null_;
set have;
rul=repeat('1 2 3 4 5 6 7 8 9 0 ',3); *so we can easily see what char is what;
put rul=;
put var= $HEX.;
run;
You can also use the c option on compress (var=compress(var,,'c');) to compress out control characters (which are often the ones you're going to run into in these situations).
Finally - 'A0'x is a good one to add to the list, the non-breaking space, if your data comes from the web.
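A rough sketch that combines the two suggestions, reusing the test string from above. It assumes a single-byte session encoding such as Latin1, where 'A0'x is the non-breaking space; in a UTF-8 session that character is stored as two bytes, so this would need adjusting. The names cleaned, clean, len_before and len_after are placeholders.

```sas
data cleaned;
  length var clean $40;
  var = "Something With A Special Char" || '0D'x;
  /* second argument lists 'A0'x; the c modifier adds all control characters */
  clean = compress(var, 'A0'x, 'c');
  len_before = length(var);
  len_after  = length(clean);
  put len_before= len_after=;
run;
```

If the trailing character is a control character, len_after should come out one smaller than len_before.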
If you want to see the position of the character within the ASCII table you can use the rank() function, e.g.:
data _null_;
string = 'abc123';
do i = 1 to length(string);
asc = rank(substr(string,i,1));
put i= asc=;
end;
run;
Gives:
i=1 asc=97
i=2 asc=98
i=3 asc=99
i=4 asc=49
i=5 asc=50
i=6 asc=51
Joe's solution is very elegant, but seeing as my hex->decimal conversion skills are pretty poor I tend to do it this way.

need to substring out numeric data from a character field in SAS

I have a data set with 2 variables: a subject id number and a result. The result is a character variable. It was read in from an Excel spreadsheet. Most results are numbers, but some of the results have a letter after them which was serving as a footnote in the Excel file. I need to get rid of the letters after the numbers so I can convert the data to numeric for analysis. How can I do this? Below is some code to create an example dataset of the structure that I'm talking about.
data test;
input id result $ ;
datalines;
1 13
2 15
3 20
4 25c
5 75
6 99c
7 89b
8 10a
9 100
10 67
;
run;
Have a look at the compress and input functions.
num = input(compress(result, , "dk"), best.);
input converts character to numeric, interpreting the data using the informat you provide (best. here).
compress can be used to strip certain characters from a string. Here the d modifier adds all numeric digits to the list of characters, and the k modifier requests that the listed characters be kept rather than removed, so only the digits remain.
You may have to tweak the compress arguments a bit to deal with more complicated cases such as decimal points.
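One possible tweak, as a sketch: list the decimal point and minus sign in the second argument so they are kept along with the digits. The data set name test_num and the sample values are invented to show the idea and are not from the question.

```sas
/* Hypothetical data with a decimal and a negative value */
data test_num;
  input id result $;
  /* keep digits (d) plus '.' and '-', drop everything else */
  num = input(compress(result, '.-', 'dk'), best12.);
  datalines;
1 13
2 2.5c
3 -7a
4 100
;
run;
```

This still assumes the footnote letters are the only stray characters; a value with two decimal points, for example, would not convert cleanly.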

how many values are missing for each observation

I have data as follows:
ID date shoesize shoetype
1 4/3/12 . bball
2 . 12 running
3 1/2/12 8 .
4 . 9.5 bball
I want to count the number of '.' there are in each row and make a frequency table with the information. Thanks in advance
You can determine the number of missing values in a row with the NMISS and CMISS functions (NMISS for numeric, CMISS for character). If you have a list of just some of your variables, you should use that list; if you use _numeric_ instead, the count includes number_missing itself, which is still missing at that point (hence the -1 there).
data want;
set have;
number_missing=nmiss(of _numeric_) + cmiss(of _character_)-1;
run;
Then do whatever you want with that new variable.
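For the frequency table the question asks about, here is a sketch of one way to finish the job. It rebuilds the sample data, passes an explicit variable range to CMISS (so no -1 correction is needed), and then runs PROC FREQ. The mmddyy8. informat for the date column is an assumption about how the dates should be read.

```sas
data have;
  input id date :mmddyy8. shoesize shoetype $;
  format date mmddyy8.;
  datalines;
1 4/3/12 . bball
2 . 12 running
3 1/2/12 8 .
4 . 9.5 bball
;
run;

data want;
  set have;
  /* explicit range, so number_missing is not part of its own count */
  number_missing = cmiss(of id--shoetype);
run;

proc freq data=want;
  tables number_missing;
run;
```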
NMISS doesn't work if you wish to evaluate character variables. It converts character variables in the list of arguments to numeric, which results in a missing value being counted every time a character variable is encountered. CMISS doesn't convert character variable values to missing, and therefore you get the correct answer.
Obviously you can choose not to include the character variables as your arguments, however I am assuming that you want to count missing values in character variables as well, based on the sample you provided. If this is the case the following should do what you want.
DATA WANT3;
SET HAVE;
NUMBER_MISSING = 0;
NUMBER_MISSING=CMISS(OF _ALL_);
RUN;
You must assign a value to NUMBER_MISSING first; otherwise the new variable is itself counted as missing.