I have a character variable called weight that holds the value 85.5.
I would like to convert it to a numeric value displayed as 85,5, that is, with a comma as the decimal separator, using SAS.
So far I am using this (messy) two-step approach:
weight_num= (weight*1);
format weight_num COMMAX13.2;
How can this be achieved in a less clumsy way?
Your sample code is the recommended method of changing a variable type.
Another way is to use the TRANSTRN function to replace the period with a comma. This is only a good method if you don't plan to do any calculations on the values, because the result is still a character string.
data have;
set sashelp.class;
keep name weight:;
weight_char=put(weight, 8.1);
run;
data want;
set have;
weight_char=transtrn(weight_char, ".", ",");
run;
proc print data=want;
run;
If you just want commas to be used for the decimal point instead of periods, why not use a simple character substitution? Do you also want to change the thousands separator from comma to period? TRANSLATE() is good for that, since it can swap both characters in one call.
weight = translate(weight,',.','.,');
If you want to convert it to a number then use the INPUT() function rather than forcing SAS to convert for you.
weight_num = input(weight,comma32.);
You can then attach whatever format you want to the new numeric variable.
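Putting the two pieces together, a minimal sketch of the conversion in a single data step might look like the following. The dataset name demo is made up for illustration; the variable names follow the question.

```sas
/* Hypothetical one-row dataset mirroring the question's example */
data demo;
  weight = '85.5';                       /* character value, period as decimal separator */
  weight_num = input(weight, comma32.);  /* explicit character-to-numeric conversion */
  format weight_num commax13.2;          /* display with a comma as decimal separator */
run;

proc print data=demo;
run;
```

The stored value of weight_num is still the number 85.5; the COMMAX format only changes how it is displayed.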
I have a set of variables in SAS that should be numeric but are stored as character. The numbers use a comma as the decimal separator and I need a point. For example, I need 19,000417537 to become 19.000417537. I tried translate without success: the comma is still there, and I am not able to convert the variable to numeric using input(). Can anyone help me please?
Thank you in advance
Best
Use INPUT() with the COMMAX informat.
data have;
length have $20.;
have = "19,000417537";
want = input(have, commax32.);
format want 32.8;
run;
proc print data=have;
run;
Obs    have            want
1      19,000417537    19.00041754
In two steps you can replace the , with . with tranwrd and then use input to convert it to numeric.
data yourdf;
set df;
charnum2=tranwrd(charnum, ",", "."); /*replace , with .*/
numvar = input(charnum2, 12.); /*convert to numeric*/
run;
You can use the COMMA informat to read strings with commas in them. But if you want the commas treated as decimal points instead of being ignored, then you probably need to use COMMAX instead (or perhaps the NLNUM informat, so that the meaning of commas and periods in the text depends on your LOCALE settings).
So if the current dataset is named HAVE and the text you want to convert is in the variable named STRING you can create a new dataset named WANT with a new numeric variable named NUMBER with code like this:
data want;
set have;
number = input(string,commax32.);
run;
I am trying to convert the variable "MAGE" from character to numeric in SAS.
I tried new_MAGE = input(MAGE, informat.);
and received an error.
You need to use an actual informat specification, not the literal text informat. (with the trailing period).
For most values try the normal numeric informat. The INPUT() function does not care if you use a width on the informat specification that is larger than the length of the string being read. So just use:
data want;
set have;
new_MAGE = input(MAGE, 32.);
run;
If the values in MAGE have thousands separators then you might want to use the COMMA informat instead.
data want;
set have;
new_MAGE = input(MAGE, comma32.);
run;
I'm trying to use INPUT function, as it is always suggested, but it seems that SAS has some problems with proper interpretation of amounts like:
2,30
1,61
0,00
...and I end up with missing values. Perhaps it's caused by the comma being the thousands separator where SAS comes from ;)
data temp;
old = '1,61';
new = input(old, 5.2);
run;
Why the result of above is new = .?
It seems that I've found a workaround: replacing the comma with a period using TRANWRD before the INPUT function is called (see the code below). But it's quite an ugly solution, and I suppose there must be a proper one.
data temp;
old = '1,61';
new = input(tranwrd(old,',','.'), 5.2);
run;
In your example new = . because the w.d informat does not recognize the comma as a decimal separator. See the note in the log.
NOTE: Invalid argument to function INPUT at line 4 column 11.
old=1,61 new=. ERROR=1 N=1
NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to
missing values.
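The documentation contains a list of the various SAS informats. Based on the documentation, you can use the COMMAX informat, which reads values that use a comma as the decimal separator. The description quoted here is that of the COMMAX format of the same name, which writes values in that style:
COMMAXw.d - Writes numeric values with a period that separates every three digits and a comma that separates the decimal fraction.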
The modified code looks like this:
data temp;
old = '1,61';
new = input(old,commax5.);
run;
proc print;
run;
The resulting output is:
Obs    old     new
1      1,61    1.61
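If you want the new variable displayed in the same style as the original, you can add the statement format new commax5.2; to the data step. The decimal count matters here: with commax5. and no decimal places, 1.61 would be displayed rounded to 2.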
Thanks to Tom for pointing out that SAS uses informats in the INPUT() function.
data:
Hell_TRIAL21_o World
Good Mor_Trial9_ning
How do I remove the _TRIAL21_ and _TRIAL9_?
What I did was find the positions of the first _ and the second _. Then I want to remove everything from the first _ through the second _, but the COMPRESS function cannot do that. How can I do it?
x = index(string, '_');
if (x>0) then do;
y = x+1;
z = find(string, '_', y);
end;
Text = " Hell_TRIAL21_o World Good Mor_Trial9_ning";
var = catx("", scan(text,1,"_"), "__", scan(text,3,"_"), "_", scan(text,5,"_"));
Note that the default length of the variable var may not suit your case. Remember to adjust it accordingly.
Perl regular expressions are a good way of identifying this sort of string. CALL PRXCHANGE is the call routine that will remove the relevant characters in place. It requires PRXPARSE beforehand to compile the search-and-replace pattern.
I've used modify here to amend the existing dataset. You may prefer to use set to write out to a new dataset and test the results first.
data have;
input string $ 30.;
datalines;
Hell_TRIAL21_o World
Good Mor_Trial9_ning
;
run;
data have;
modify have;
regex = prxparse('s/_.*_//'); /* identify and remove anything between 2 underscores */
call prxchange(regex,-1,string);
run;
Or to create a new variable and dataset, just use prxchange (which doesn't require prxparse).
data want;
set have;
new_string = prxchange('s/_.*_//',-1,string);
run;
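One caveat, offered as a sketch rather than a required change: .* is greedy, so if a string ever held more than one tagged section, the pattern above would remove everything from the first underscore to the last one. Assuming a tag never contains an underscore itself, the non-greedy form .*? stops each match at the nearest closing underscore. The dataset names sample and want_lazy and the third data row are made up for illustration.

```sas
/* Hypothetical sample data; the third row has two tagged sections */
data sample;
  infile datalines truncover;
  input string $40.;
  datalines;
Hell_TRIAL21_o World
Good Mor_Trial9_ning
A_x_B and C_y_D
;
run;

data want_lazy;
  set sample;
  /* .*? matches as little as possible between two underscores; */
  /* -1 repeats the replacement for every match in the string   */
  new_string = prxchange('s/_.*?_//', -1, string);
run;

proc print data=want_lazy;
run;
```

For the third row the non-greedy pattern leaves AB and CD, whereas the greedy pattern would leave only AD. The first two rows give the same result either way.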
I have data with decimal commas in a tab-delimited file. When I imported it, the values came into SAS as character variables that still contain the commas,
like 23,1 53,2
I now want to convert these to numeric, displayed with either a period or a comma. How do I do it?
If I use
want=input(have,comma.);
informat want comma.;
format want comma.;
I get missing values!
You can use the NUMXw.d informat to read numbers that have a comma as the decimal separator.
want = input(have,NUMX4.1);
Or just use that informat on the initial INPUT statement when reading the file, and then you don't have to convert anything afterwards.
NUMXw.d also is a format, so you can use it to display the variable with a comma if that's how you are more comfortable viewing decimals.
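As a small sketch of using NUMX in both directions, the step below reads the character value with the NUMX informat and then attaches the NUMX format for display. The dataset name demo and the widths chosen are assumptions for illustration; the variable names follow the question.

```sas
/* Hypothetical one-row dataset with a comma-decimal character value */
data demo;
  have = '23,1';
  want = input(have, numx8.);  /* informat: the comma is read as the decimal separator */
  format want numx8.1;         /* format: the number is displayed with a decimal comma */
run;

proc print data=demo;
run;
```

The variable want is a true numeric, so calculations work as usual; only its printed form uses the comma.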
You can use a TRANWRD function to replace the comma with a period, then wrap this within an INPUT function to convert the new character value to numeric.
F2 = INPUT(TRANWRD(F1,',','.'),4.1);