I am trying to convert a character column to numeric and I have tried using:
var=input(var,Best12.);
var=var*1;
Both of them returned character columns, and there is only 1 warning message:
"Character values have been converted to numeric values at the places given by: (Line):(Column). 7132:4".
Is there another way to do this conversion inside SAS?
(my apologies if this is trivial)
Thanks!
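What you're doing will work if you assign the result to a new variable. A variable's type is fixed once it is defined in a data step, so assigning the numeric result back to the existing character variable just converts it to character again: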
data tmp;
char='1';
run;
data tmp;
set tmp;
num=char*1;
run;
proc contents; run;
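If the numeric column has to keep the original name, one option is to rename the character column on the way in and drop it afterwards. This is a minimal sketch; char_old is a hypothetical temporary name, not something from the question.

```sas
data tmp;
  char = '1';
run;

data tmp;
  /* rename the character column as it is read, freeing the name "char" */
  set tmp(rename=(char=char_old));
  /* "char" is new in this step, so it is created as numeric */
  char = input(char_old, best12.);
  drop char_old;
run;

proc contents data=tmp; run;
```

PROC CONTENTS should now report char as a numeric variable.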
In SAS DI when I connect a user written transformation to an output table, the variable _OUTPUT_connect is assigned. In my case it looks something like this:
%let _OUTPUT_connect = DEFER=YES READBUFF=25000 DBCLIENT_MAX_BYTES=1 DB_LENGTH_SEMANTICS_BYTE=NO PATH=MY_PATH AUTHDOMAIN="MY_AUTH_DOMAIN"
Now I'm trying to extract the PATH and AUTHDOMAIN variables from _OUTPUT_connect. My solution for now is the following:
%let _authdomain = %sysfunc(scan(&_OUTPUT_connect,7," "));
%let _path = %sysfunc(scan(%sysfunc(scan(&_OUTPUT_connect,5," ")),2,"="));
This works but it breaks if the order of the _OUTPUT_connect variables changes.
I thought I'd use regex to match the parameter values: PATH=[match_this] and AUTHDOMAIN="[match_this]", but I have problems parsing the variable _OUTPUT_connect because it contains double quotes. When I manually assign _OUTPUT_connect without the double quotes I can do the following:
data _null_;
re = prxparse('/PATH=(\w)*/');
string = "&_OUTPUT_connect";
position = prxmatch(re, string);
put position=;
matched_pattern=prxposn(re, 0, string);
put matched_pattern=;
run;
Output:
position=75
matched_pattern=PATH=A1091211_SAS_SRV
The problem however is that _OUTPUT_connect contains double quotes, and the regex function fails when the input string contains double quotes. Since _OUTPUT_connect is assigned automatically, I cannot change the format.
I've tried to remove the double quotes from _OUTPUT_connect with %let unquoted =%sysfunc(translate(%quote(&test),' ','"')); This does work, but it leaves a space in place of each double quote.
Is there an easy way to retrieve the values of PATH and AUTHDOMAIN from _OUTPUT_connect?
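You can extract the name-value pairs of the connection string by using SCAN with modifiers. The 'q' modifier tells SCAN (and COUNTW) to ignore delimiters that sit inside quoted substrings, so a quoted value containing spaces stays in one piece.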
Example:
data nvps(label='name value pairs' keep=name value);
s = 'name1=value1 name2="value2" name3="value 3"';
do index = 1 to countw(s,' ','q');
nvp = scan(s,index,' ','q');
name = scan(nvp,1,'=','q');
value = scan(nvp,2,'=','q');
output;
end;
run;
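Applied to the original question, the same loop could write PATH and AUTHDOMAIN into macro variables. This is only a sketch under assumptions: the %let below is a shortened, made-up stand-in for the value SAS DI assigns, and the target names _path and _authdomain are taken from the question. Reading the value with SYMGET avoids putting the embedded double quotes inside a quoted string literal, and DEQUOTE strips the quotes from the extracted value.

```sas
/* Hypothetical stand-in for the automatically assigned value */
%let _OUTPUT_connect = DEFER=YES READBUFF=25000 PATH=MY_PATH AUTHDOMAIN="MY_AUTH_DOMAIN";

data _null_;
  length s $500 nvp $200 name $32 value $200;
  /* symget fetches the macro variable at run time, quotes included */
  s = symget('_OUTPUT_connect');
  do index = 1 to countw(s, ' ', 'q');
    nvp   = scan(s, index, ' ', 'q');
    name  = upcase(scan(nvp, 1, '=', 'q'));
    value = dequote(scan(nvp, 2, '=', 'q'));
    /* order of the pairs no longer matters: match on the name */
    if name = 'PATH' then call symputx('_path', value);
    else if name = 'AUTHDOMAIN' then call symputx('_authdomain', value);
  end;
run;

%put &=_path &=_authdomain;
```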
In a data step, I create a new character variable. However, the values in the output data set are truncated from "CL_FA_IF" to "CL". What is the problem?
else if var in ("CL", "FA", "IF") then var2 = "CL_FA_IF";
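A likely cause, assuming var2 is first assigned a two-character literal earlier in the same IF/ELSE chain (the question does not show that part): SAS fixes the length of a character variable the first time it appears in the step, so longer values assigned later are truncated. Declaring the length before any assignment avoids this. The data set names and the "AB" branch below are hypothetical.

```sas
data have;
  input var $;
  datalines;
AB
CL
;
run;

data want;
  set have;
  /* declare the length before the first assignment to var2 */
  length var2 $8;
  if var = "AB" then var2 = "AB";   /* without LENGTH, this would fix var2 at 2 characters */
  else if var in ("CL", "FA", "IF") then var2 = "CL_FA_IF";
run;
```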
I'm trying to subset some data with the following code:
data want;
set have;
array fx(12) fx1-fx12;
do i=1 to 12;
if substr(dx(i),1,4) in ('1115')
or substr(fx(i),1,5) in ('1146%')
then output;
end;
run;
I cross-reference the output against the original dataset using proc freq. The frequency counts for '1115' match, as they should. They don't for '1146%'. I thought '%' is a wildcard that I can use?
I also tried '/^1146\d*/'
The % wildcard is only recognized by the LIKE operator, which is available in WHERE expressions. In an IF statement, '1146%' is compared literally, so only values whose first five characters are exactly 1146% would match. For the IF statement you will want to use the string prefix equality (i.e. starts with) operator =: or the prefix-in-set operator IN:
Also, since the prefix you care about is four characters long, you could substr 4 characters and check = '1146'. Furthermore, since you are taking the substring from position 1 (1st character), you won't need substr at all when using =: or IN: (see 3rd example).
To use Perl regular expression pattern matching, call the PRXMATCH function. Your pattern '/^1146\d*/' does not need \d* (0 or more digits); '/^1146/' will match anything that '/^1146\d*/' does.
Example(s):
if substr(dx(i),1,4) in ('1115') or fx(i) =: '1146' then output;
if substr(dx(i),1,4) in ('1115') or substr(fx(i),1,4) = '1146' then output;
/* expanded example for case of checking two prefix possibilities */
if dx(i) in: ('1115') or fx(i) in: ('1146', '124') then output;
if dx(i) =: '1115' or prxmatch('/^1146/', fx(i)) then output;
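For a quick check of the two approaches side by side, here is a small self-contained sketch. The data set and its three values are made up for illustration; both steps should keep 11460 and 1146 and drop 1147.

```sas
data have;
  input fx1 :$8.;
  datalines;
11460
1146
1147
;
run;

/* IF: =: compares only as many characters as the shorter operand has */
data want_if;
  set have;
  if fx1 =: '1146';
run;

/* WHERE: this is where the % wildcard of LIKE is available */
data want_where;
  set have;
  where fx1 like '1146%';
run;
```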
My variable contains '123 - How to convert char with substring'.
The result I need is the value 123 in a variable of numeric type.
I can take Substring(myvariable,1,3), but how do I get it as numeric? Thank you!
my_var_numeric = input(substr(my_var, 1, 3), 8.);
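Here is another option. This one keeps only the digits, regardless of how many there are or where they sit in the string. Note that all digits found anywhere in the string end up concatenated into one number.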
data w;
string = '1230 - How to convert char with substring';
number = input(compress(string,'0123456789','k'),best.);
output;
run;
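The difference matters when the text after the leading number also contains digits. The sketch below uses a made-up string to show both behaviours; taking the first blank-delimited token with SCAN is an alternative assumed here, not part of the answers above.

```sas
data w;
  string = '123 - route 66';
  /* 'kd' keeps digits only, so every digit in the string is concatenated */
  all_digits  = input(compress(string, , 'kd'), best12.);   /* 12366 */
  /* first blank-delimited token only; ?? suppresses invalid-data messages */
  first_token = input(scan(string, 1, ' '), ?? best12.);    /* 123 */
run;
```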
I am creating a SAS dataset from a database that includes a VARCHAR(5) key field.
This field includes some entries that use all 5 characters and some that use fewer.
When I import this data, I would prefer to pad all the shorter entries out to use all five characters. For this example, I want to pad on the left with 0, the character zero. So, 114 would become 00114, ABCD would become 0ABCD, and EA222 would stay as it is.
I've attempted this with a simple data statement, but of course the following does not work:
data test;
set databaseinput;
format key $5.;
run;
I've tried to do this with a user-defined informat, but I don't think it's possible to specify the ranges correctly on character fields, per this SAS KB answer. Plus, I'm fairly sure proc format won't let me define the result dynamically in terms of the incoming variable.
I'm sure there's an obvious solution here, but I'm just missing it.
Here is an alternative:
data padded_data_dsn;
length key $5;
drop raw_data;
set raw_data_dsn(rename=(key=raw_data));
key = translate(right(raw_data),'0',' ');
run;
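To try this on the values from the question, a self-contained version might look like the following. The input step is an assumption standing in for the database read. RIGHT shifts the value to the right edge of its 5-character length, and TRANSLATE then turns the leading blanks into zeros, giving 00114, 0ABCD and EA222. Any blank inside a key would also become a zero, so this assumes keys contain no embedded spaces.

```sas
/* Stand-in for the database input, using the question's sample keys */
data raw_data_dsn;
  length key $5;
  input key $;
  datalines;
114
ABCD
EA222
;
run;

data padded_data_dsn;
  length key $5;
  set raw_data_dsn(rename=(key=raw_data));
  /* right-align within 5 characters, then replace blanks with '0' */
  key = translate(right(raw_data), '0', ' ');
  drop raw_data;
run;
```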
data raw_data_dsn;
length key key1 $5;
do key = '4', 'A114', 'A1140';
/* REPEAT(s, n) returns n+1 copies of s, so ask for 4 - length(key) */
if length(key) < 5 then key1 = catt(repeat('0', 4 - length(key)), key);
/* a key that already uses all five characters needs no padding */
else key1 = key;
output;
end;
run;
I'm sure someone will have a more elegant solution, but the following code works. It prepends five zeros to the value and reverses the result with $revers10., which also left-aligns it, so the original characters (reversed) come first and the zeros follow. It then reverses the first five characters again with $revers5., which leaves the value in its original order, left-padded with zeros to five characters.
data raw_data_dsn;
format key $varying5.;
key = '114'; output;
key = 'ABCD'; output;
key = 'EA222'; output;
run;
data padded_data_dsn;
format key $5.;
drop raw_data;
set raw_data_dsn(rename=(key=raw_data));
key = put(put('00000' || raw_data ,$revers10.),$revers5.);
run;
Here's what worked for me.
data b (keep = str2);
format str2 $5. ;
set a;
catlength = 4 - length(str);
cat = repeat('0', catlength);
str2 = catt(cat, str);
run;
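It works by measuring the length of the existing string and building a padding string, cat, with repeat('0', 4 - length(str)). REPEAT returns one more copy than its second argument, so this yields 5 - length(str) zeros, which are then concatenated in front of the original string.
Notice that it goes wrong if the original string is already 5 characters long: a zero is still prepended and the last character is cut off (see the last row of the output below).
Also, it won't work if the input string has a $5. format on it.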
data a; /*input dataset*/
input str $;
datalines;
a
aa
aaa
aaaa
aaaaa
;
run;
input:
a
aa
aaa
aaaa
aaaaa
output:
0000a
000aa
00aaa
0aaaa
0aaaa
I use this, but it only works when the value is numeric. Try other informats in the INPUT call.
data work.prueba;
format xx $5.;
xx='1234';
vv=PUT(INPUT(xx,best5.),z5.);
run;