What is the difference between VALUE and INVALUE statements in SAS?

I am studying for the SAS Base exam. I come from a medical background and intend to learn clinical SAS programming.
What is the difference between the INVALUE and VALUE statements?
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473466.htm
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473472.htm

The VALUE statement in PROC FORMAT is used to define a FORMAT. The INVALUE statement is used to define an INFORMAT.
In SAS you use a FORMAT to convert values into text and an INFORMAT to convert text into values. You can use a FORMAT with the PUT statement or the PUT(), PUTN() or PUTC() functions. You can attach a format to a variable using a FORMAT statement. You can use an INFORMAT with the INPUT statement or the INPUT(), INPUTN() or INPUTC() functions. You can attach an informat to a variable using the INFORMAT statement.
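As a minimal sketch of the distinction, the step below defines one format and one informat and then uses each. The names sexfmt and sexin and the code values are hypothetical, chosen only for illustration.

```sas
proc format;
  /* VALUE defines a FORMAT: stored value -> display text */
  value sexfmt
    1     = 'Male'
    2     = 'Female'
    other = 'Unknown';
  /* INVALUE defines an INFORMAT: text -> stored value */
  invalue sexin
    'Male'   = 1
    'Female' = 2
    other    = .;
run;

data _null_;
  code  = 1;
  label = put(code, sexfmt.);    /* format: number to text */
  back  = input(label, sexin.);  /* informat: text back to number */
  put label= back=;              /* writes both values to the log */
run;
```

The PUT() call goes from value to text and the INPUT() call goes from text to value, which mirrors the VALUE/INVALUE split.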

Related

SAS and DBSASTYPE

I have this working for one column:
PROC APPEND base=MMUSAGE.bc_ent_jas_radius (dbsastype=(max_date =date9.))
data=work.radius_master force;
RUN;
But I have a second column named MIN_DATE that I also want to format as date9. in the PROC APPEND. Is it possible?
DBSASTYPE is for going the other way. When you are reading from an external database you are telling SAS what type to translate into. It wants a type name (like you would use in PROC SQL) not a format specification.
DBTYPE is the option that tells SAS what type to create in the external database. You need to specify the type using the syntax of the remote database. It will only have any effect if the BASE dataset did not already exist before the PROC APPEND step.
If you just want to attach the DATE9. format to MAX_DATE use a FORMAT statement.
PROC APPEND base=MMUSAGE.bc_ent_jas_radius data=work.radius_master force;
format max_date date9. ;
RUN;
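For the second column asked about, a sketch along the same lines: a FORMAT statement accepts a list of variables, and DBTYPE= takes one entry per column. The 'DATE' type name is an assumption and has to be replaced with whatever the remote database actually calls its date type.

```sas
/* Attach DATE9. to both date columns with one FORMAT statement */
proc append base=MMUSAGE.bc_ent_jas_radius data=work.radius_master force;
  format max_date min_date date9.;
run;

/* Only relevant when the BASE table does not exist yet:          */
/* DBTYPE= lists one database type per column.                    */
/* 'DATE' is an assumed type name; use the remote database's own. */
proc append base=MMUSAGE.bc_ent_jas_radius
              (dbtype=(max_date='DATE' min_date='DATE'))
            data=work.radius_master force;
run;
```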

How do I format new variables that I Create?

I am trying to create a new variable in SAS. I use IF/THEN logic to create a new character variable. However, the variable is being truncated. How do I format the new variable so that all the characters appear?
DATA Clinic;
set stat201.clinic;
rename age_at_consent=Age ldiastolic=dbp lsystolic=sbp ldobp=datebp;
if smoking="" then smoking="Missing";
if smokecat=0 then smokecat_1="Never Smoker";
if smokecat=1 then smokecat_1="Current Everyday Smoker";
if smokecat=2 then smokecat_1="Former Smoker";
if smokecat=9 then smokecat_1="Never Assessed";
attrib smokecat_1 format =$25.;
drop smokecat;
rename smokecat_1= smokecat;
run;
SAS defines a variable at the point where it first appears. Since the first appearance is in this assignment statement,
if smokecat=0 then smokecat_1="Never Smoker";
it will be defined as a character variable of length 12.
Just define the variable BEFORE using it. You can use a LENGTH statement
length smokecat_1 $25;
or an ATTRIB statement to define a variable.
attrib smokecat_1 length=$25;
Attaching the $25. format does not change the length of the variable.
It just means that you want SAS to use 25 characters to display the values. But the variable will still only be 12 characters long. There is no need to attach any format to the variable. Formats are instructions on how to display values and SAS already knows how to display character values.
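Applied to the step from the question, the fix might look like the sketch below. The only substantive change is the LENGTH statement placed before the first assignment; the ELSE keywords are an optional tidy-up, and the other RENAME and smoking lines from the question are left out for brevity.

```sas
data clinic;
  set stat201.clinic;
  /* Define the variable BEFORE its first use so it gets 25 bytes */
  length smokecat_1 $25;
  if smokecat=0 then smokecat_1="Never Smoker";
  else if smokecat=1 then smokecat_1="Current Everyday Smoker";
  else if smokecat=2 then smokecat_1="Former Smoker";
  else if smokecat=9 then smokecat_1="Never Assessed";
  drop smokecat;
  rename smokecat_1=smokecat;
run;
```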
To complement Tom's answer above:
SAS will define the variable type and length based on when it first appears.
I agree it is a good practice to define a variable before using it.

formatting variables and then recoding

I started out formatting my variables using PROC FORMAT. Later on I found that I had to change some of my variables in my dataset. I want to maintain the formatting I originally created, but I don't think I can do this if I recode. Am I correct in assuming this? I think I will have to just change some of my formats to accommodate my new variables, but is there a way
I'm not quite sure I understand your question, but I think I can still help by explaining the difference between recoding variables in SAS and using formatted values.
If you have created a format, that format is applied to the values in the SAS dataset at the time that your analysis is run. So, if you have a value of "Block A" in a character variable in your dataset and you have a format that maps "Block A" to the formatted value of 1, then if you later change the value of "Block A" to something else and rerun your analysis, the new value will no longer match that format entry, so the formatted value 1 will no longer be printed in your output or used in your analysis. Formats work independently of the underlying values in your datasets. When you run an analysis, SAS essentially looks through your datasets at run time, maps each of the values to the formatted values as you've specified in your PROC FORMAT step, and then performs the analysis using the formatted values.
If you want to keep the original formatting, you can use two separate formats: one for the old format and one for the new formatting and call the appropriate format into your procedures depending on when you want to use which format.
You can also use the PUT function in a DATA step to convert the previously formatted value and "hard code" the formatted value as an actual value in your dataset. For example, if you have a format called "blockno" that you used with a variable called "block" then, using your old format, you could create a variable called block_old and set it to the old formatted value with:
block_old=put(block, $blockno.);
You could then modify block with your new values. You would then have two variables in your dataset: block_old, which would contain the old formatted values of your variable, and block, which, after your changes, would contain the new values.
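Both options can be sketched as follows. The dataset name mydata, the format names $blockold and $blocknew, and the mapped values are hypothetical, chosen only to illustrate the idea.

```sas
proc format;
  /* Hypothetical old and new mappings kept side by side */
  value $blockold 'Block A'='1' 'Block B'='2';
  value $blocknew 'Block A'='1' 'Block C'='2';
run;

/* Option 1: choose the format per procedure call */
proc freq data=mydata;
  tables block;
  format block $blockold.;
run;

/* Option 2: hard-code the old formatted value before recoding */
data mydata2;
  set mydata;
  block_old = put(block, $blockold.);
run;
```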
Proc Format is not a format statement
With PROC FORMAT, you create formats; you do not assign them to variables. To assign them you can use, for instance, a FORMAT statement.
The format of a variable is not its internal length
A SAS variable can have only one of two types: numeric (which non-SAS programmers call double) or character (which non-SAS programmers call fixed-length character). It can, however, have hundreds of different formats. The format just determines the way the variable is represented in a report.
You can perfectly well change the format of a variable without changing its length.
Try this:
proc format;
/* 10<-20 excludes 10 itself, so the ranges do not overlap */
value myFormat
0-10 = 'small'
10<-20 ='medium'
20<-100='large' ;
run;
data test1;
infile datalines;
length myVar 8;
input myVar;
format myVar 6.2;
datalines;
1
2.1
9.12
10.123
15.1234
22.12345
50.123456
;
data test2;
set test1;
format myVar myFormat.;
run;
data test3;
set test2;
format myVar 12.6;
run;
title 'In test1, myVar has format 6.2';
proc print data=test1;
run;
title 'In test2, myVar has format myFormat';
proc print data=test2;
run;
title 'In test3, myVar has format 12.6';
proc print data=test3;
run;
You can create a format in a format catalog and store it for future reference. Datasets often gain new variables and have existing variables updated with new data, so keeping both the old and the new formats in a catalog helps maintain a history of the original and current values.
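A minimal sketch of storing formats permanently, assuming a hypothetical library myfmts at a placeholder path; the format names sizeold and sizenew are illustrative as well.

```sas
/* Hypothetical location for the permanent format catalog */
libname myfmts '/path/to/formats';

/* LIBRARY= writes the formats to myfmts.formats instead of WORK */
proc format library=myfmts;
  value sizeold 0-10='small' 10<-20='medium' 20<-100='large';
  value sizenew 0-50='low'   50<-100='high';
run;

/* Tell SAS to search that catalog when resolving format names */
options fmtsearch=(myfmts);
```

In a later session, rerunning only the LIBNAME and OPTIONS statements should be enough to make both versions available again.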

How to read excel files containing brackets in header in sas?

I have an Excel file where one of the columns is temperature (F). When I import it into SAS, the variable name is saved as temperature_F_, or, when I use the validvarname=any option, exactly as temperature (F). However, I now need to convert the data to Celsius. Whichever variable name I use (i.e. temperature_F_ or temperature (F)), it does not work. For the second one, SAS treats temperature as a function. So what's the way around this?
The exact nature of your problem isn't clear, as temperature_F_ should be fine if you've imported under validvarname=v7.
data want;
set have;
temperature_c_ = (5/9)*((temperature_f_)-32);
run;
If you have to work with the validvarname=any; version, then you use name literals:
data want;
set have;
'temperature(c)'n = (5/9)*(('temperature(f)'n)-32);
run;
A name literal is similar to a date literal (e.g., '01JAN2010'd), but it is used for member, variable, and other names.
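Another option, sketched here under the assumption that the imported name is exactly temperature (F) with a space, as written in the question, is to rename the column once and use ordinary names afterwards. The names temperature_f and temperature_c are hypothetical.

```sas
options validvarname=any;

data want;
  /* Rename the awkward column as it is read in */
  set have (rename=('temperature (F)'n = temperature_f));
  temperature_c = (5/9) * (temperature_f - 32);
run;
```

The text inside the name literal has to match the actual variable name, including any spaces, so it is worth checking PROC CONTENTS output first.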

How to convert date in SAS to YYYYMMDD number format

In the test_1 table, the my_date field has the DATE9. format.
I would like to convert it to a pure numeric format (number length 8) which is of the form YYYYMMDD.
I would also like to do this in a proc sql statement ideally.
Here's what I have so far.
Clearly I need something to manipulate the my_date field.
rsubmit;
proc sql;
CREATE TABLE test_2 AS
SELECT
my_date
FROM
test_1
;
quit;
endrsubmit;
FYI: I am finding it quite difficult to understand the various methods in SAS.
To clarify, the field should actually be a number, not a character field, nor a date.
If you want the field to store the value 20141231 for 31DEC2014, you can do this:
proc sql;
create table want as
select input(put(date,yymmddn8.),8.) as date_num
from have;
quit;
input(..) turns something into a number, put(..) turns something into a string. In this case, we first put it with your desired format (yymmddn8. is YYYYMMDD with no separator), and then input it with 8., which is the length of the string we are reading in.
In general, this should not be done; storing dates as numerics of their string representation is a very bad idea. Try to stay within the date formats, as they are much easier to work with once you learn them, and SAS will happily work with other databases to use their date types as well. If you want the "20141231" representation (to put it to a text file, for example), make it a character variable.
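The two alternatives recommended above can be sketched in PROC SQL using the table and column names from the question. The output names test_2 and my_date_char are illustrative.

```sas
proc sql;
  create table test_2 as
  select my_date format=yymmddn8.,        /* still a real SAS date, shown as YYYYMMDD */
         put(my_date, yymmddn8.) as my_date_char length=8  /* character copy for export */
  from test_1;
quit;
```

The first column keeps working with date functions; only its display changes.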
Don't.
You lose the ability to use built in SAS functions for date calculations.
SAS stores dates as numbers, with 0 being Jan 1, 1960, and counts days from there. Formats are used to display the dates as desired for reporting and presentation.
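A small sketch of what is lost: with real date values a subtraction gives days, while the same subtraction on YYYYMMDD numbers gives a meaningless figure. The variable names are illustrative.

```sas
data _null_;
  d1 = '31DEC2014'd;
  d2 = '01JAN2015'd;
  days_real  = d2 - d1;                  /* 1 day apart */
  days_fake  = 20150101 - 20141231;      /* 8870, not a day count */
  next_month = intnx('month', d1, 1);    /* date functions need real dates */
  put days_real= days_fake= next_month= date9.;
run;
```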