Rename SAS variables containing special characters

After importing a huge Excel file with thousands of records into SAS, I found that some variable names contain special characters. For example, I have variable names like "pre.tal" and "miss.auto.t0". SAS doesn't allow me to proceed with the statistical analysis because of the special characters in these variable names. I tried the following code, but it failed. Any ideas? Thanks
options validvarname=any;
data one;
set one;
rename pre.tal_month_t0 = 'prenatal_month_t0'n;
run;

It is the name with special characters that needs quotes and an "n":
rename 'pre.tal_month_t0'n = prenatal_month_t0;
And by the way, if you want to see your original names in listings, you can assign a label to variables
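data one;
set one;
rename 'pre.tal_month_t0'n = prenatal_month_t0;
/* the rename applies to the output data set, so the label statement still uses the old name */
label 'pre.tal_month_t0'n = 'pre.tal_month_t0';
run;
proc print data=one label;
run;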

If you are pressed for time, set the option validvarname=v7 before importing. SAS will then build variable names from letters, digits, and underscores, replacing special characters and whitespace with underscores. Then you don't have to worry about using name literals ('...'n).
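As a rough sketch of that approach (the file path below is a placeholder, not something from the question), setting the option before the import might look like this:

```sas
/* Assumption: hypothetical path; replace it with your own workbook. */
options validvarname=v7;

proc import datafile="/path/to/workbook.xlsx"
    dbms=xlsx
    out=one
    replace;
run;

/* List the resulting variable names to check how the column headers were converted. */
proc contents data=one;
run;
```

The option has to be in effect when the import runs; changing it afterwards does not rename variables in a data set that already exists.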

You can also set options validvarname=v7; before the import, which lets you use the imported names without name literals.

Related

How to Handle Strings in SAS

Why is it that sometimes we need to wrap the string value in single quotes, sometimes double quotes, sometimes no quotes? This is extremely frustrating when I have to go from one proc to another, especially if it involves changing a file name or url dynamically. What is the logic behind this hideous monstrosity?
%let Name01 = John Smith;
%let Name02 = 'John Smith';
%let Name03 = "John Smith";
All three work.
%let Folder = /97network/read/Regions/Northeast/;
%let FileName = SalesTarget.xlsx;
proc import
datafile = "&Folder.&FileName."
dbms = xlsx
out = SymList replace;
sheet="Sheet1";
run;
Here, &Folder.&FileName. must be in double quotes.
filename OutFile "/06specialty/ATam/AMZN.csv";
proc http url = &urlAddress. method = "get" out = OutFile;
run;
Finally, if I want to download stock prices from Yahoo Finance, url = may take the address in single quotes, or &urlAddress. in no quotes, but you cannot use double quotes. OutFile can be in single or double quotes, but not no quotes. Then in the out = clause, you have OutFile, not &OutFile.
SAS strings are very simple. They are enclosed in either single or double quote characters.
'Hello there'
"Good-bye"
If the enclosing character appears in the string it needs to be doubled up.
'I don''t know'
In your first example, it is probably your operating system that allows filenames to include optional quotes. On Windows and Linux the quotes can even be required in some situations, when the path includes spaces or other characters that the command shell would normally interpret as delimiters on the command line.
Adding macro logic into the program is probably a large part of your confusion. First figure out what code works for the commands you are using and then you can try to generate that code using the macro processor.
Once you introduce macro logic you need to pay attention to whether your strings use single or double quotes. There is a big difference in how macro logic interacts with the two. Strings bounded by single quote characters are ignored by the macro processor, so the macro trigger characters & and % are treated as normal characters. Strings bounded by double quote characters are processed, so macro references inside them are resolved.
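A minimal sketch of that difference, using a macro variable named Name purely for illustration:

```sas
%let Name = John Smith;

data _null_;
    length resolved unresolved $20;
    resolved   = "&Name";  /* double quotes: the macro processor substitutes the value */
    unresolved = '&Name';  /* single quotes: the text is left exactly as typed */
    put resolved= unresolved=;
run;
```

The log should show resolved=John Smith and unresolved=&Name, which is why the proc import example needs double quotes around &Folder.&FileName.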
Your second example adds the complexity of working with URL syntax. URL strings use the & character for their own purposes, so you need to understand how SAS is going to see the code you type and whether the macro processor will attempt to interpret it, to ensure that the string the URL needs is actually created.
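Two common ways to keep the macro processor away from the & in a URL are sketched below. The example.com address and its query parameters are made up for illustration, and the temporary fileref is an assumption rather than the asker's setup:

```sas
/* Assumption: hypothetical URL and a temporary output file. */
filename OutFile temp;

/* Option 1: single quotes, so the macro processor never looks at the ampersand. */
proc http
    url='https://example.com/quote?symbol=AMZN&interval=1d'
    method="GET"
    out=OutFile;
run;

/* Option 2: mask the ampersand with %nrstr when the address is kept in a macro variable. */
%let urlAddress = %nrstr(https://example.com/quote?symbol=AMZN&interval=1d);

proc http
    url="&urlAddress"
    method="GET"
    out=OutFile;
run;
```

In the second form the double quotes are needed so that &urlAddress resolves, while %nrstr keeps &interval from being treated as a macro variable reference.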
SAS has 50 years of history and a lot of the code is legacy. SAS is backwards compatible. You can still run code 30 years old with no issues. There are lots of oddities, such as quotes, that are there...and will always be there. SAS is kind of a conglomeration of ~300 languages (every proc is unique plus multiple meta-languages).
Since SAS will never change, best to just ignore the oddities.
One other thing. SAS runs on lots of O/Ss so every nuance there has to be accommodated in a mostly neutral way.

How to keep single quote when importing Excel data to SAS

I am importing an Excel spreadsheet into SAS using Proc Import:
Proc Import out=OUTPUT
Datafile = "(filename)"
DBMS=XLSX Replace;
Range = "Sheet1$A:Z";
run;
My numeric data columns contain a mixture of values held in Excel as numerics and '0 values held as text - i.e. with a leading apostrophe / single quote. When SAS imports these it treats them all the same (i.e. it returns Character strings of the values with the leading apostrophe stripped out).
This results in differences from the spreadsheet when calculations are applied (e.g. averaging) as Excel treats the '0 values as missing but SAS treats them as 0.
Is it possible to import the values as strings including the leading single quote / apostrophe, so that I can replace the '0 with missing values but keep the 0 records as 0? I would like to avoid having to manually manipulate the data in Excel as this data is drawn from an external source (don't ask...)
I doubt it. I think Excel doesn't really consider the leading apostrophe as part of the value. It's just a crazy way to indicate that a value is a text string (rather than numeric). When SAS imports the data, it recognizes that the quote is not part of the value. So if you've got an Excel column with '0 in some cells and 0 in others, it's going to come in as character, and I don't think you can tell the difference between them.
Unfortunately, the xlsx engine doesn't support the DBSASTYPE option. Other engines that import Excel have the DBSASTYPE option. That should allow you to tell SAS to import a column as a numeric variable, even if it sees character values. If it's the case that you want all text values in the cell converted to missing, that might do the trick. But it's possible it would still treat '0 the same as 0. I'm away from SAS, so can't test.
Option:
The ~ (tilde) format modifier enables you to read and retain single quotation marks.
http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a003209907.htm
Is it possible to convert the .xlsx to .txt keeping the single quotes? Because it is not possible to infile xlsx in a data step.
filename df disk 'C:\data_temp\ex.txt';
data test;
infile df firstobs=2;
input ID $2. x ~$3. ;
run;
proc print data=test;
run;

Permanently Reformat Variable Values in SAS

I am trying to reformat my variables in SAS using the put statement and a user defined format. However, I can't seem to get it to work. I want to make the value "S0001-001" convert to "S0001-002". However, when I use this code:
put("S0001-001",$format.)
it returns "S0001-001". I double-checked my format and it is mapped correctly. I import it from Excel, convert it to a SAS table, and convert the SAS table to a SAS format.
Am I misunderstanding what the put statement is supposed to be doing?
Thanks for the help.
Assuming that you tried something like this it should work as you intended.
proc format ;
value $format 'S0001-001' = 'S0001-002' ;
run;
data want ;
old= 'S0001-001';
new=put(old,$format.);
put (old new) (=:$quote.);
run;
Make sure that you do not have leading spaces or other invisible characters in either the variable value or the START value of your format. Similarly make sure that your hyphens are actual hyphens and not em-dash characters.
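One way to check for such hidden characters is to print the value in hex and compare the lookup with and without strip(). This is only a sketch: $remap is a hypothetical format name standing in for your own format, and the leading blank is planted on purpose.

```sas
/* Assumption: $remap is a hypothetical format name used for illustration. */
proc format;
    value $remap 'S0001-001' = 'S0001-002';
run;

data check;
    length raw $10;
    raw = ' S0001-001';                    /* note the leading blank */
    as_is    = put(raw, $remap.);          /* no match, so the value is not remapped */
    stripped = put(strip(raw), $remap.);   /* matches once the blank is removed */
    put raw= $hex20.;                      /* on ASCII systems the output starts with 20, the blank */
    put (as_is stripped) (=:$quote.);
run;
```

Assigning the result of put() to a variable in the output data set, as done here, is what makes the change permanent.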

SAS: Where statement not working with string value

I'm trying to use PROC FREQ on a subset of my data called dataname. I would like it to include all rows where varname doesn't equal "A.Never Used". I have the following code:
proc freq data=dataname(where=(varname NE 'A.Never Used'));
run;
I thought there might be a problem with trailing or leading blanks so I also tried:
proc freq data=dataname(where=(strip(varname) NE 'A.Never Used'));
run;
My guess is for some reason my string values are not "A.Never Used" but whenever I print the data this is the value I see.
This is a common issue in dealing with string data (and a good reason not to!). You should consider the source of your data - did it come from web forms? Then it probably contains nonbreaking spaces ('A0'x) instead of regular spaces ('20'x). Did it come from a unicode environment (say, Japanese characters are legal)? Then you may have transcoding issues.
A few options that work for a large majority of these problems:
Compress out everything but alphabet characters. where=(compress(varname,,'ka') ne 'ANeverUsed') for example. 'ka' means 'keep only' and 'alphabet characters'.
UPCASE or LOWCASE to ensure you're not running into case issues.
Use put varname $HEX.; in a data step to look at the underlying characters. Each pair of hex digits is one byte, which is one character in a single-byte encoding. 20 is a space (which strip would remove). Sort by varname before doing this so that the rows you think should have this value end up next to each other, and then look at what differs. It is probably some special character, or multibyte characters, or who knows what, but it should be apparent here.
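Put together, the options above might look like the following sketch. It reuses dataname and varname from the question; the obs= limit and the hex width are arbitrary choices.

```sas
/* Inspect the raw bytes of the first few rows ($hex60. shows the first 30 bytes). */
data _null_;
    set dataname(obs=20);
    put varname= / varname $hex60.;
run;

/* Compare on letters only, ignoring case, spaces, and punctuation. */
proc freq data=dataname(where=(upcase(compress(varname, , 'ka')) ne 'ANEVERUSED'));
    tables varname;
run;
```

If the hex dump shows A0 where a space was expected, the value contains a nonbreaking space, and the letters-only comparison sidesteps it.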

Unmatched quotation mark issue in SAS

As is well known, SAS needs special care with quotation marks inside a string.
E.g.
%let quoted="I'd like to";
data temp;
set temp;
quoted="&quoted";
run;
An error is encountered when submitting this.
In fact I need to copy data to one dataset from another one that has a lot of records containing quotation marks. On assignment, an error occurs and the data step stops executing, which invalidates the rest of the code. In this case it isn't practical to modify the original data set by doubling the quotation marks.
So instead of having to add a duplicated one, like "I''d like to", is there any other way of avoiding the error, or of making the data step keep executing?
Thanks,
When using the macro language (including the %let statement) you do not want to use quotes to identify text strings. To place a single quote in a string you must use one of the macro quoting functions such as %str(). The correct syntax to place a single unmatched quote in a macro variable using %let is shown below. The % symbol before the single quote is an escape character to tell SAS that the following character (a single quote) should be used as a literal. Also note that I've removed the double quotes from the %let as they are not required.
%let quoted=%str(I%'d like to);
data temp;
quoted="&quoted";
run;
Cheers
Rob
I'm not sure what you're trying to achieve in the actual situation, but in the above situation it can be solved by removing the double quotation marks in the data step.
%let quoted="I'd like to";
data temp;
set temp;
quoted=&quoted;
run;
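If the real goal is to copy records from one data set to another, the values never need to pass through macro code at all, and embedded quotes are just data. Where a value does have to travel through a macro variable, reading it with symget() at run time avoids pasting it into the program text. A sketch, with source and target as hypothetical data set names:

```sas
/* Assumption: source and target are hypothetical data set names. */
data source;
    length quoted $40;
    quoted = "I'd like to";
run;

/* Values are copied as data, so embedded quotes need no special handling. */
data target;
    set source;
run;

/* Store a value in a macro variable at run time. */
data _null_;
    set source(obs=1);
    call symputx('quoted', quoted);
run;

/* Read it back at run time instead of substituting it into the code. */
data temp;
    length quoted $40;
    quoted = symget('quoted');
run;
```

Because symget() fetches the value while the step is executing, the quote inside it is never parsed as SAS syntax.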