SAS ALTER TABLE MODIFY Length - sas

Suppose I have in SAS someTable with a column someColumn of type Character.
I can adjust length, format, informat and label in the following way:
ALTER TABLE WORK.someTable
MODIFY someColumn char(8) format=$CHAR6. informat=$CHAR6. label='abcdef'
But I doubt if this is the correct way for the following reasons:
It seems pointless that the syntax requires the type char because column type can't be changed with a MODIFYstatement.
This code does not work if someColumn is of type Numeric or Date.
The syntax for changing length is inconsistent with the syntax for changing format/informat/label.
Actually, I expected the following code to work:
ALTER TABLE WORK.someTable
MODIFY someColumn length=8 format=$CHAR6. informat=$CHAR6. label='someLabel'
This code runs without errors nut does not change the length.
Question:
What is the correct syntax to modify the length of a column using ALTER TABLE / MODIFY?
(For arbitrary column type like character/numeric/date.)

The syntax for defining the altered variable ("column") is the same as the syntax PROC SQL uses for defining a variable. What the documentation calls "column-definition Component"
column data-type <column-modifier(s)>
That is why you use the SQL syntax, char(n) or num, for specifying the type. Note that SAS datasets only have two data types: fixed length character strings and floating point numbers. SAS will automatically convert any other SQL data-type into the proper one of those.
The limitations on altering the type are spelled out in the documentation:
Changing Column Attributes
If a column is already in the table, then
you can change the following column attributes by using the MODIFY
clause: length, informat, format, and label. The values in a table are
either truncated or padded with blanks (if character data) as
necessary to meet the specified length attribute.
You cannot change a character column to numeric and vice versa. To
change a column’s data type, drop the column and then add it (and its
data) again, or use the DATA step.
Note: You cannot change the length of a numeric column with the ALTER
TABLE statement. Use the DATA step instead.
Note that to make such changes to a dataset SAS will have to create a whole new dataset. So you might as well just write a data step to create the new dataset and then you will have full control.
Also be careful if you change the length of character variable to make sure that the attached FORMAT is still correct.
In your example you are changing the variable to be 8 bytes long, but are attaching a format that will only display the first 6 bytes.
In general it is best to not attach formats to character variables to avoid the confusion that type of mismatch can cause. Unfortunately there is no way to remove the attached format using PROC SQL. The best you could do is to set the format to $., that is without an explicit width. If you want to completely remove the format you will need to use a FORMAT statement in PROC DATASETS or a data step.

Related

How to read SAS format dictionary into SPSS?

I am trying to load SAS data file together with its variable and value labels, but I cant seem to make it work.
I have 3 SAS files
sas data ("data_final.sas7bdat")
sas format dictionary that contains the format name, variable name/labels, etc ("formats.sas7bdat")
sas format library that contains the format name, value name/labels,etc ("format_library.sas7bdat")
I am trying to load this to SPSS using the following code but it doesn't work. It loads the data and the variable labels but not the value labels.
GET SAS DATA='\data_final.sas7bdat'
/FORMATS='\formats.sas7bdat'
/FORMATS='\format_library.sas7bdat'.
Any help is greatly appreciated.
Thank you!
The FORMATS= option wants the name of the SAS format catalog, not another SAS dataset. Catalogs use sas7bcat as the extension.
GET SAS DATA='\data_final.sas7bdat'
/FORMATS='\formats.sas7bcat'.
If you really cannot get it to work then read in the formats_library.sas7bdat and look at the FMTNAME, TYPE, START, END and LABEL variables and use those to generate the SPSS code you need to attach data labels to your SPSS data.
FMTNAME is the name of the format. The TYPE determines if it is applies to character values or numeric values (or if in fact is an INFORMAT instead of FORMAT). The START and END mark the range of values (frequently they will be the same) and LABEL is the decoded value (aka the data label). Unlike in SPSS in SAS you only have to define the code/decode mapping once and then apply to as many variables as you want.
The dataset you show as being named formats.sas7bdat looks like it is the variable level metadata. That should list each variable (NAME) and what format, if any, has been attached to it (FORMAT). So if that shows there is a variable named FRED that has the format YESNO attached to it then look for records in format_library where FMTNAME='YESNO' and see what values it maps. So if FRED is numeric with values 1 and 2 then format YESNO might have one record with START='1' and LABEL='YES' and another with START='2' and LABEL='NO'.

Generating many tables from a single table in SAS

I have a table in SAS which contains the format information I want. I want to bin this data into the categories given.
What I don't know how to do is create either an xform or a format file from the data.
An example table looks like this:
TxtLabel Type FmtName label Hlo count
. I FAC1f 0 O 1
1996 I FAC1f 1 2
1997 I FAC1f 2 3
I want to date all years in a different data set as after 1997 OR before 1996.
The problem is that I know how to do this by hard coding it, but these files changes the numbers each time so I'm hoping to use the information in the table to generate the bins rather than hard code them.
How do I go about binning by data using a column from another dataset for my categorization?
Edit
I have two data sets, one which looks like the one I have included and one which has a column titled "YEAR". I want to bin the second data set using the categories from the first. In this case there are two available years in TxtLabel. There are multiple tables like this, I'm looking at how to generate PROC Format code from the table, rather than hard coding the values.
This should run to create the desired format
Proc FORMAT CNTLIN=MyCustomFormatControlData;
run;
You can then use it in a DATA Step, or apply it to a column in a data set.
Binning the data might be construed as 'data set splitting' but your question does not make it clear if that is so. Generic arbitrary splitting is often done with one of these techniques:
wall paper source code resolved from macro variables populated from information garnered in a Proc SQL or Proc FREQ step
dynamic data splitting using hash object for grouping records in memory, and saved to a data set with an .output() call.
Sample code for explicit binning
data want0 want1 want2 want3 want4 want5 wantOther;
set have;
* explicit wall paper;
select (put(year,FAC1f.));
when ('0') output want0;
when ('1') output want1;
when ('2') output want2;
when ('3') output want3;
when ('4') output want4;
when ('5') output want5;
otherwise output wantOther;
run;
This is the construct that source code generated by macro can produce, and requires
one pass to determine the when/output lines that are to be generated
a second pass to apply the lines of code that were generated.
If this is the data processing that you are attempting:
do some research (plenty of info out there)
write some code
make a new question if you get errors you can't resolve
Proc FORMAT
Proc FORMAT has a CNTLIN option for specifying a data set containing the format information. The structure and values expected of the Input Control Data Set (that CNTLIN) is described in the Output Control Data Set documentation. Some of the important control data columns are:
FMTNAME
specifies a character variable whose value is the format or informat name.
LABEL
specifies a character variable whose value is associated with a format or an informat.
START
specifies a character variable that gives the range's starting value.
END
specifies a character variable that gives the range's ending value.
As the requirements of the custom format to be created get more sophisticated you will need to have more information variables in the input control data set.

ERROR: Expression using IN has components that are of different data types

I am using the below query in SAS Enterprise Guide to find the count for different offer_ids customers for different dates :
PROC SQL;
CREATE TABLE test1 as
select offer_id,
(Count(DISTINCT (case when date between '2016-11-13' and '2016-12-27' then customer_id else 0 end))) as CUSTID
from test
group by offer_id
;QUIT;
ERROR: Expression using IN has components that are of different data types
Note: Here, Offer_id is the character variable whereas Custome_id is an numeric variable.
Most likely the error is caused by comparing the numeric variable DATE to the character strings '2016-11-13'. If you want to specify a date literal in SAS you must specify the date in DATE9 format and append the letter D after the close quote.
date BETWEEN '13NOV2016'd AND '27DEC2016'd
Note that there is no reference to any external database in the posted code. But even if your source table was tdlib.tdtable instead of work.test you still need to use SAS syntax when writing SAS code. Let the Teradata engine figure out how to convert it for you.
You don't make it clear whether this is being run on SAS or Teradata (via pass through).
I'm guessing SAS, in which case you are missing d after your dates (e.g. '2016-11-13'd). Without this, the dates are being treated as text instead of formatted numbers.
The error statement is slightly misleading, as SAS is treating the between statement as an in statement.

SAS - Convert numeric values to character including dates

I have a need to combine two sas datasets having the same column names but one of the datasets will have a numeric value where the same name in the other dataset are character. I was thinking to evaluate each field with the %isnum function and based on this convert the number to character:
char_id = put(id, 7.) ;
drop id ;
rename char_id=id ;
What I need to know is how do I determine the length of the variable to use in the PUT and what would I do for date fields?
Sounds like you need to analyze your data and see how long things are. Use an obviously too long format (best32.) and then see how long the actual results are, or use max.
For date fields, you need to decide how you want your date fields to look.
date_c = put(date_n,date9.);
That would be the default, but there are literally hundreds of date formats you can choose from.
You can also use proc contents data=myDataStes out=VarDatasets; run; and you will get the list of variables with type, length, format, informat and so on.

Change variable length in SAS dataset

I need to change the variable length in a existing dataset. I can change the format and informat but not the length. I get an error. The documentation says this is possible but there are no examples.
Here is my issue. My data source could change so I don't want to pre define columns on import. I want to do a generic import and then look for certain columns and adjust the length.
I have tried PROC SQL and DATA steps. It looks like the only way to do this is to recreate the dataset or the column. Which I don't want to do.
Thanks,
Donnie
If you put your LENGTH statement before the SET statement, in a Data step, you can change the length of a variable. Obviously, you will get truncation if you have data longer than your new length.
However, using a DATA step to change the length is also re-creating the data set, so I'm confused by that part of your question.
The only way to change the length of a variable in a datastep is to define it before a source (SET) dataset is read in.
Conversely you can use an alter statement in a proc sql. SAS support alter statement
Length of a variable remains same once you set the dataset. Add length statements before you set the dataset if you need to change length of a columns
data a;
length a, b, c $200 ;
set b ;
run ;