Deleting underscore in a SAS column - sas

I have a data set like this
Country Year Value
with a lot of observations. All the years in the column Year are preceded by an underscore, like _1990 for example. But I'd like to delete the underscore because in the other datasets I need it is written like 1990.
I tried the following
data x;
set x;
by Year;
replace _ = "";
RUN;
But it didn't work.
I should mention that I am new to SAS.
Looking forward to your help.

This removes the _ character and stores the result in a new variable:
Data want;
set have;
YEAR_NEW=compress(YEAR,"_","i");
run;
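If the other datasets store Year as a number rather than as text, the cleaned value may also need converting before a merge. A minimal sketch, assuming Year is a character variable in a dataset called have (year_num is a made-up name for illustration):

```sas
data want;
  set have;
  /* remove the underscore, then read the remaining digits as a number;
     year_num is a hypothetical variable name */
  year_num = input(compress(year, '_'), 32.);
run;
```

If Year is already compared as text in the other datasets, the character version from COMPRESS() alone is enough.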

Related

Unable to convert a character variable with numbers with a comma into numeric

I have a set of variables in SAS that should be numeric but are character. The numbers use a comma as the decimal separator and I need a point. For example, I need 19,000417537 to be 19.000417537. I tried translate without success: the comma is still there, and I'm not able to convert the variable to numeric using input(). Can anyone help me please?
Thank you in advance
Best
Use INPUT() with the COMMAX informat.
data have;
length have $20.;
have = "19,000417537";
want = input(have, commax32.);
format want 32.8;
run;
proc print data=have;
run;
Obs have want
1 19,000417537 19.00041754
In two steps, you can replace the , with . using tranwrd and then use input to convert the result to numeric.
data yourdf;
set df;
charnum2=tranwrd(charnum, ",", "."); /*replace , with .*/
numvar = input(charnum2, 32.); /*convert to numeric; width 32 so longer values are not truncated*/
run;
You can use the COMMA informat to read strings with commas in them. But if you want it to treat the commas as decimal points instead of ignoring them, then you probably need to use COMMAX instead (or perhaps the NLNUM informat, so that the meaning of commas and periods in the text will depend on your LOCALE settings).
So if the current dataset is named HAVE and the text you want to convert is in the variable named STRING you can create a new dataset named WANT with a new numeric variable named NUMBER with code like this:
data want;
set have;
number = input(string,commax32.);
run;
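When some of the strings might not be valid numbers, it can help to flag them instead of filling the log with invalid-data notes. A rough sketch using the ?? modifier of INPUT(), with the same HAVE and STRING names as above (bad_value is a made-up flag variable):

```sas
data want;
  set have;
  /* ?? suppresses the invalid-data messages and leaves _ERROR_ at 0 */
  number = input(string, ?? commax32.);
  /* bad_value (hypothetical name) is 1 when a non-blank string could not be read */
  bad_value = (missing(number) and not missing(string));
run;
```

Rows with bad_value = 1 can then be inspected separately.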

How to create a character variable out of 2 others where the first word is upcase and the other in brackets in SAS?

I have a table in SAS and I want to create a new column C computed from A and B: A should be in uppercase letters and B in brackets.
If A is dog and B is cat, then C in that row should be DOG (cat).
I'm very new to SAS. How can I do that?
I know that I can get uppercase with upcase(A), but I don't know how to put two character variables one after another to create a new variable, or how to put a variable in brackets.
SAS has a series of CAT functions (CAT, CATS, CATT, CATX) that make that simple. CATS() strips the leading/trailing spaces from the values. CATX() lets you specify a value to paste between the values.
data want ;
set have;
length new $100 ;
new=catx(' ',upcase(a),cats('(',b,')')); /* parentheses, to match the requested DOG (cat) */
run;
Personally, I use cat/cats/catx only in very specific cases. For a problem like this, you can simply use the concatenation operator ||, which makes the code much easier to understand:
data want;
set have;
attrib new format=$100.;
new = strip(upcase(a)) || " (" || strip(b) || ")";
run;
OK, that's maybe a little more verbose, but I think it's also easier to understand for a new SAS programmer :)

Convert string into numeric and change period to comma separator sas

I have a string called weight that is 85.5
I would like to convert it into a numeric 85,5, replacing the decimal separator with a comma, using SAS.
So far I am using this (messy) two-step approach
weight_num= (weight*1);
format weight_num COMMAX13.2;
How can this be achieved in a less clumsy way?
Your sample code is the recommended method of changing a variable type.
Another way is to use the transtrn function to replace the . with a comma. This is only a good method if you don't plan to do any calculations on the values, because the result is still a character variable.
data have;
set sashelp.class;
keep name weight:;
weight_char=put(weight, 8.1);
run;
data want;
set have;
weight_char=transtrn(weight_char, ".", ",");
run;
proc print data=want;
run;
If you just want to change it so that commas are used for the decimal point instead of periods, then why not just use a simple character substitution? Do you also want to change the thousands separator from comma to period? TRANSLATE() is good for that.
weight = translate(weight,',.','.,');
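If you want to convert it to a number, then use the INPUT() function rather than forcing SAS to convert for you. Apply it to the original string, where the period is still the decimal point, because the COMMA informat ignores commas.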
weight_num = input(weight,comma32.);
You can then attach whatever format you want to the new numeric variable.
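Put together, the conversion and the display format might look like the sketch below. It builds a one-row dataset just for illustration and reuses the COMMAX13.2 format from the question:

```sas
data want;
  weight = '85.5';                  /* character value, as in the question */
  weight_num = input(weight, 32.);  /* explicit conversion to numeric */
  format weight_num commax13.2;     /* comma displayed as the decimal separator */
run;

proc print data=want;
run;
```

The stored value stays numeric, so calculations still work; only the displayed form changes.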

SAS: How to delete word between two specific position?

data:
Hell_TRIAL21_o World
Good Mor_Trial9_ning
How do I remove the _TRIAL21_ and _TRIAL9_?
What I did was find the positions of the first _ and the second _. Then I want to remove everything from the first _ to the second _, but the compress function cannot do that. How can I do it?
x = index(string, '_');
if (x>0) then do;
y = x+1;
z = find(string, '_', y);
end;
Text= " Hell_TRIAL21_o World Good Mor_Trial9_ning";
var= catx("",scan(text,1,"_"),"__",scan(text,3,"_"),"_", scan(text,5,"_"));
Note that the length of the variable var may not suit your case. Remember to adjust it accordingly.
Perl regular expressions are a good way of identifying these sorts of strings. call prxchange is the call routine that will remove the relevant characters. It requires prxparse beforehand to compile the search-and-replace pattern.
I've used modify here to amend the existing dataset, obviously you may want to use set to write out to a new dataset and test the results first.
data have;
input string $30.;
datalines;
Hell_TRIAL21_o World
Good Mor_Trial9_ning
;
run;
data have;
modify have;
regex = prxparse('s/_.*_//'); /* identify and remove anything between 2 underscores */
call prxchange(regex,-1,string);
run;
Or to create a new variable and dataset, just use prxchange (which doesn't require prxparse).
data want;
set have;
new_string = prxchange('s/_.*_//',-1,string);
run;
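One caveat: .* is greedy, so if a single string holds more than one tagged section, everything from the first underscore to the last one is removed. A possible variant, assuming the text to drop never contains an underscore itself, matches only non-underscore characters between the pair:

```sas
data want;
  set have;
  length new_string $30;
  /* [^_]* stops at the next underscore, so each _..._ pair is removed on its own */
  new_string = prxchange('s/_[^_]*_//', -1, string);
run;
```

For the two sample rows this behaves the same as the original pattern, since each row has exactly one pair of underscores.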

Strip apostrophes from a character string (compress?)

I have a string which looks like this:
"ABAR_VAL", "ACQ_EXPTAX_Y", "ACQ_EXP_TAX", "ADJ_MATHRES2"
And I'd like it to look like this:
ABAR_VAL ACQ_EXPTAX_Y ACQ_EXP_TAX ADJ_MATHRES2
I.e. no apostrophes or commas and single space separated.
What is the cleanest / shortest way to do so in SAS 9.1.3?
Preferably something along the lines of:
call symput ('MyMacroVariable',compress(????,????,????))
Just to be clear, the result needs to be single space separated, devoid of punctuation, and contained in a macro variable.
Here you go..
data test;
var1='"ABAR_VAL", "ACQ_EXPTAX_Y", "ACQ_EXP_TAX", "ADJ_MATHRES2"';
run;
data test2;
set test;
call symput('macrovar',COMPBL( COMPRESS( var1,'",') ) ); /* drop quotes and commas, then collapse repeated blanks */
run;
%put &macrovar;
Is this part of an infile statement or are you indeed wanting to create macro variables that contain these values? If this is part of an infile statement you shouldn't need to do anything if you have the delimiter set properly.
infile foo DLM=',' ;
And yes, you can indeed use the compress function to remove specific characters from a character string, either in a data step or as part of a macro call.
COMPRESS(source<,characters-to-remove>)
Sample Data:
data temp;
input a $;
datalines;
"boo"
"123"
"abc"
;
run;
Resolve issue in a data step (rather than create a macro variable):
data temp2; set temp;
a=compress(a,'"');
run;
Resolve issue whilst generating a macro variable:
data _null_; set temp;
call symput('MyMacroVariable',compress(a,'"'));
run;
%put &MyMacroVariable.;
With the latter code, the macro variable is overwritten on every observation, so it ends up holding only the value from the last record. You'll have to loop through the observations if you want to see the compressed value for each record. :)
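If the goal is one macro variable holding all the cleaned values separated by single spaces, PROC SQL is one alternative to looping. This swaps in a different technique from the data step above; a sketch against the same temp dataset:

```sas
proc sql noprint;
  /* strip the double quotes and join all rows with a single space */
  select compress(a, '"')
    into :MyMacroVariable separated by ' '
    from temp;
quit;

%put &MyMacroVariable.;
```

The SEPARATED BY clause is what collects every row into the one macro variable instead of keeping only a single value.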
To compress multiple blanks into one, use compbl : http://www.technion.ac.il/docs/sas/lgref/z0214211.htm