Change variable length in SAS dataset - sas

I need to change the length of a variable in an existing dataset. I can change the format and informat, but not the length: I get an error. The documentation says this is possible, but there are no examples.
Here is my issue. My data source could change so I don't want to pre define columns on import. I want to do a generic import and then look for certain columns and adjust the length.
I have tried PROC SQL and DATA steps. It looks like the only way to do this is to recreate the dataset or the column, which I don't want to do.
Thanks,
Donnie

If you put your LENGTH statement before the SET statement, in a Data step, you can change the length of a variable. Obviously, you will get truncation if you have data longer than your new length.
However, using a DATA step to change the length is also re-creating the data set, so I'm confused by that part of your question.

The only way to change the length of a variable in a DATA step is to define it before the source (SET) dataset is read in.
Alternatively, you can use an ALTER TABLE statement in PROC SQL. See SAS Support: ALTER TABLE statement.
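As a minimal sketch of the PROC SQL route, assuming a hypothetical table WORK.HAVE with a character column called NAME (neither name comes from the question): the MODIFY clause takes the column name followed by its SQL type and new length. This applies to character columns only; the documentation quoted further down this page states that ALTER TABLE cannot change the length of a numeric column.

```sas
/* Hypothetical names: WORK.HAVE and NAME are assumptions for illustration */
proc sql;
  alter table work.have
    modify name char(200);  /* new length for the character column */
quit;
```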

The length of a variable is fixed once the SET statement has read the dataset. Put the LENGTH statement before the SET statement if you need to change the length of a column:
data a;
length a b c $200 ; /* variable names are separated by spaces, not commas */
set b ;
run ;

Related

SAS ALTER TABLE MODIFY Length

Suppose I have in SAS someTable with a column someColumn of type Character.
I can adjust length, format, informat and label in the following way:
ALTER TABLE WORK.someTable
MODIFY someColumn char(8) format=$CHAR6. informat=$CHAR6. label='abcdef'
But I doubt if this is the correct way for the following reasons:
It seems pointless that the syntax requires the type char, because the column type can't be changed with a MODIFY statement.
This code does not work if someColumn is of type Numeric or Date.
The syntax for changing length is inconsistent with the syntax for changing format/informat/label.
Actually, I expected the following code to work:
ALTER TABLE WORK.someTable
MODIFY someColumn length=8 format=$CHAR6. informat=$CHAR6. label='someLabel'
This code runs without errors but does not change the length.
Question:
What is the correct syntax to modify the length of a column using ALTER TABLE / MODIFY?
(For arbitrary column type like character/numeric/date.)
The syntax for defining the altered variable ("column") is the same as the syntax PROC SQL uses for defining a variable, which the documentation calls the "column-definition component":
column data-type <column-modifier(s)>
That is why you use the SQL syntax, char(n) or num, for specifying the type. Note that SAS datasets only have two data types: fixed length character strings and floating point numbers. SAS will automatically convert any other SQL data-type into the proper one of those.
The limitations on altering the type are spelled out in the documentation:
Changing Column Attributes
If a column is already in the table, then
you can change the following column attributes by using the MODIFY
clause: length, informat, format, and label. The values in a table are
either truncated or padded with blanks (if character data) as
necessary to meet the specified length attribute.
You cannot change a character column to numeric and vice versa. To
change a column’s data type, drop the column and then add it (and its
data) again, or use the DATA step.
Note: You cannot change the length of a numeric column with the ALTER
TABLE statement. Use the DATA step instead.
Note that to make such changes to a dataset SAS will have to create a whole new dataset. So you might as well just write a data step to create the new dataset and then you will have full control.
Also be careful if you change the length of character variable to make sure that the attached FORMAT is still correct.
In your example you are changing the variable to be 8 bytes long, but are attaching a format that will only display the first 6 bytes.
In general it is best to not attach formats to character variables to avoid the confusion that type of mismatch can cause. Unfortunately there is no way to remove the attached format using PROC SQL. The best you could do is to set the format to $., that is without an explicit width. If you want to completely remove the format you will need to use a FORMAT statement in PROC DATASETS or a data step.
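A minimal sketch of the PROC DATASETS route for removing the format, using the table and column names from the question: a FORMAT statement that lists the variable without a format name clears the attached format. Only the metadata changes, so the data is not rewritten.

```sas
/* Table and column names are taken from the question above */
proc datasets lib=work nolist;
  modify someTable;
    format someColumn;  /* no format name given, so the attached format is removed */
run;
quit;
```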

SAS issue with Rename variable

I have a dataset with 6 character variables including Day5,Day6,Day7,City1,City2,City3.
I am trying to rename Day5, which was extracted as i__Day5 after importing a txt file into SAS. The variable i__Day5 is not getting renamed to Day5, so no observations are shown for this variable.
data subset ;
set subset ;
rename i__Day5 = Day5;
run;
Thanks.
As Tom mentioned your problem likely stems from overwriting the original table with the modified data, and then trying to submit your code to run again.
It will work the first time when the variable i__Day5 exists, but on running it a second time, the variable will no longer exist as it has already been renamed.
To avoid this issue never re-use table names. This code would be better:
data subset2 ;
set subset ;
rename i__Day5 = Day5;
run;
Space is cheap so there's no real downside to doing this, plus it gives you an easy way to compare the table before/after running the code.
The only other possibility is that you are viewing field labels rather than field names. As samkart mentions, you can verify the actual field names by running PROC CONTENTS against your table.
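For reference, a minimal sketch of that check, assuming the table is WORK.SUBSET as in the question. The output lists each variable's name alongside its label, so a mismatch between the two is easy to spot.

```sas
/* Lists variable names, types, lengths, formats and labels for the table */
proc contents data=work.subset;
run;
```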

SAS: Naming a table based on a cell value

I have an Excel file that always has the same name but the contents of the table changes. I am looking to write a code that names the table based on the value in one of the cells.
For example:
If cell A3 equals "Employment Information", I want the table to be named "Jobs".
If cell A3 equals "Inflation Information", I want the table to be named "Currency".
etc.
I want to define ONE macro (i.e. %table(filename,cell)), or ONE loop of if then else statements to achieve this. Unfortunately, I can't seem to wrap my head around this logically. If someone with experience in SAS could help me out that would be awesome. I will edit my question soon to include some codes that I have already tried but which have failed to get the job done.
You need to read the data to find the content. You could then create a macro variable to make it easy to rename the dataset using PROC DATASETS.
Let's assume you have converted the Excel sheet into a dataset named WORK.HAVE. Let's also assume that you know what variable contains the data from column A, let's call that variable A. Is there anything in the data that makes it possible to tell which observation is the one to use? For now let's just assume that by A3 you mean the second observation since the first row of the sheet should have the variable names.
So in that case you want something like this:
%let newname=have;
data _null_;
set have (firstobs=2);
if A="Employment Information" then call symputx('newname','Jobs');
else if A="Inflation Information" then call symputx('newname','Currency');
stop;
run;
proc datasets nolist lib=work;
change have=&newname;
run;
quit;

Is there a way to change the length of a variable using Proc datasets?

I have a lot of datasets and variables whose attributes I need to modify. Everything is working fine EXCEPT for the instance below, where I need to change the length of a variable:
data inputdset ;
format inputvar $20. ;
inputvar='ABCDEFGHIJKLMNOPQRST' ;
run ;
proc datasets lib=work nolist memtype=data ;
modify inputdset ;
attrib inputvar format=$50. length=50 ;
run ;
quit ;
Running this gives the following notes in the log:
NOTE: The LENGTH attribute cannot be changed and is therefore being ignored.
NOTE: MODIFY was successful for WORK.INPUTDSET.DATA.
...the final inputvar has a format of $50. as expected but still has a length of 20. Is there a way to have the length increased for these cases using proc datasets (or even better, if the length can be increased to match format)?
It's always risky to say no, but I'm going to try it. PROC DATASETS can modify the metadata about a dataset, not the data stored in each record. Changing the length of a variable requires changing the value stored in every record for that variable (truncating it or lengthening it and padding with blanks). Thus changing the length of a variable requires rewriting the entire dataset, which can be done by the DATA step or PROC SQL, but not PROC DATASETS.
Just a note: the documentation does state, under the restrictions for the MODIFY statement, that the length cannot be changed with the ATTRIB statement.
https://support.sas.com/documentation/cdl/en/proc/68954/HTML/default/viewer.htm#n0ahh0eqtadmp3n1uwv55i2gyxiz.htm
MODIFY Statement
Changes the attributes of a SAS file and, through the use of subordinate statements, the attributes of variables in the SAS file.
Restriction: You cannot change the length of a variable using the LENGTH= option in an ATTRIB statement
To change the length of a column in a dataset you will have to rebuild the dataset with the new length. Typically you would do this using a DATA step code pattern like the following (PROC SQL is another option).
data inputdset ;
format inputvar $20. ;
inputvar='ABCDEFGHIJKLMNOPQRST' ;
run ;
data inputdset;
length inputvar $ 50;
format inputvar $50.; * can change the format at the same time if you want;
set inputdset;
run;
The most common complaint with this pattern is that inputvar will now be the first column in the new dataset. You can correct this by properly listing all the variables in the length statement to preserve the original order.
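A minimal sketch of that fix, assuming a hypothetical three-column version of the dataset (id and othervar are made-up names, not part of the question). Listing every variable in the LENGTH statement, in the original order, keeps inputvar in second position while still widening it.

```sas
/* Hypothetical sample data: id and othervar are assumed for illustration */
data inputdset ;
  id = 1 ;
  inputvar = 'ABCDEFGHIJKLMNOPQRST' ;
  othervar = 'xyz' ;
run ;

data inputdset ;
  /* Declare all variables in their original order; only inputvar gets a new length */
  length id 8 inputvar $50 othervar $3 ;
  set inputdset ;
run ;
```

The other lengths have to match the existing ones (8 for the numeric id, 3 for othervar here), otherwise those columns would be resized as well.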

Probt in sas for column of values

I'm looking to run probt on a column of values in SAS, not just one value, and to get two-tailed p-values.
I have the following code that I'd like to amend:
data all_ssr;
x=.551447;
df=25;
p=(1-probt(abs(x),df))*2;
put p=;
run;
However, I would like x to be a column of values from another file. I have tried work.ttest, which is just a file of t-test values.
Many thanks
You need to use a set statement to access data from another SAS dataset.
data all_ssr;
set work.ttest; /*Dataset containing column of values*/
df=25;
p=(1-probt(abs(x),df))*2;
run;
Removing the put statement avoids clogging up the log.