I have a requirement that some SAS numeric columns must be able to store numeric values with more than 16 digits. For example:
123,456,789,123,456,789,123,123.9996
That is 24 digits before the decimal point and 4 after it, i.e. "24.4".
I've studied a few pages, such as:
http://www.sfu.ca/sasdoc/sashtml/unixc/z0344718.htm
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lrcon/p0ji1unv6thm0dn1gp4t01a1u0g6.htm
http://v8doc.sas.com/sashtml/win/numvar.htm#:~:text=The%20maximum%20number%20of%20variables,can%20be%20is%20160%20bytes.
It seems to me that the maximum numeric length SAS supports is 8 bytes, which can only hold a whole number of about 16 digits. Is there a way to store a numeric value that is "24.4", like the example above?
I'm working in SAS EG and I'm trying to convert a column that's in character format to numeric format, EXACTLY as they appear in their character format. The numbers vary in length and some have one or two leading zeros.
If I do it one way, it gets rid of all leading zeros. Another way I tried, it adds leading zeros to the point that it's as long as the longest number in the column, e.g., a 9-digit number with one leading zero now has four leading zeros because the longest number in the column is 12 digits. (I hope this description makes sense).
I'm working in SAS EG. When I run proc contents, it tells me my existing variable is a character variable of length 26. It is blank for both 'format' and 'informat.'
I need to convert it so that a new column is a numeric variable, with length 8, and 'F12.' for 'format' and 'BEST12.' for 'informat,' as I plan to use it to match two data sets.
I created the following test data set in 'regular' SAS, but I'm not sure if it fully recreates the issue I'm working on in SAS EG:
data have;
input mrn $1-12;
cards;
118283586928
003875807
038087875
0385709873
0038576830
;
run;
As you can see, I have one number that's 12 digits long (no leading zeros); two that are 9 digits (with one or two leading zeros); and two that are 10 digits (with one or two leading zeros).
Any help would be greatly appreciated.
Thanks
You cannot store 26 digit strings exactly as a number in SAS. SAS stores numbers as floating point values. You can use the CONSTANT() function to see the end of the contiguous integers that can be stored exactly.
data _null_;
  x=constant('exactint');
  put x= comma30.;
run;
x=9,007,199,254,740,992
So if you actually have values longer than 15 digits in the character variable, you will not be able to convert them to numbers without risking a loss of precision.
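To make the limit concrete, here is a minimal sketch, assuming a Windows or UNIX SAS session where numbers are IEEE 754 doubles. The variable names are only for illustration.

```sas
data _null_;
  /* largest integer up to which every integer is stored exactly */
  max_exact = constant('exactint');
  /* one more than that has no exact double representation, so it is rounded */
  next = max_exact + 1;
  /* 1 if the two values are indistinguishable after rounding */
  same = (next = max_exact);
  put max_exact= 20. next= 20. same=;

  /* a 17-digit string: the stored value may differ from the digits typed */
  big = input('12345678901234567', 32.);
  put big= 20.;
run;
```

If the last digits of big in the log differ from the string, that is the precision loss described above, not a problem with the informat.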
But if they are only 12 digits long then just convert the strings into numbers and compare the numbers.
proc sql;
create table want as
select *
from a, b
where a.mrn = input(b.mrn_string,32.)
;
quit;
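For the test data set in the question, the same conversion could be sketched in a DATA step. The names want and mrn_num are made up for illustration, and the FORMAT and INFORMAT statements simply attach the attributes the question asked for.

```sas
data want;
  set have;
  /* read the character MRN as a number; leading zeros do not affect the value */
  mrn_num = input(mrn, 32.);
  /* attach the display format and informat requested in the question */
  format mrn_num f12.;
  informat mrn_num best12.;
run;

proc contents data=want;
run;
```

Two MRNs that differ only in leading zeros become the same number, which is usually what you want when matching two data sets.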
It's not possible to have different formats in the same column in SAS. The only way to keep them looking exactly as they do while in the same column is to keep them as text. If you need to do calculations on them I'd suggest just creating a 2nd column with their numeric values.
Leading zeros can be added to numbers for display using the Zw.d format (for example, z12.).
https://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000205244.htm
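A small sketch of what the Z format does (variable names are illustrative). Note that the width is fixed for the whole column, which is why every value ends up padded to the same length.

```sas
data _null_;
  /* the numeric value is 3875807; the leading zeros are not stored */
  mrn_num = input('003875807', 32.);
  put mrn_num= z12.;   /* mrn_num=000003875807 */
  put mrn_num= z9.;    /* mrn_num=003875807 */
  /* PUT() turns the number back into a zero-padded character string */
  mrn_char = put(mrn_num, z9.);
  put mrn_char=;       /* mrn_char=003875807 */
run;
```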
I have a column with a length of 8. However, when I view the data, the value in the column is above 8 characters long.
You're mixing up Length and Format.
http://blogs.sas.com/content/sasdummy/2007/11/20/lengths-and-formats-the-long-and-short-of-it/
Length: The column length, in SAS terms, is the amount of storage allocated in the data set to hold the column values. The length is specified in bytes. For numeric columns, the valid lengths are usually 3 through 8. The longer the length, the greater the precision allowed within the column values. For character columns, the length can be 1 through 32767. For single-byte data values, that equates to the number of characters the column can hold. For multibyte data values (DBCS, Unicode, or UTF-8), where a character can occupy more than one byte, the number of characters that fit might be less than the length value of the column.
Format: The column format, in SAS terms, is basically an instruction for how to transform a raw value into an appearance that is suitable for a given purpose. A basic attribute of a format is the format length, which controls how much of the value is displayed. For example, a character column might have a storage length of 10 bytes, but a format length of 5 characters ($5. format), so when you see the formatted values you will see at most 5 characters for each record.
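As a rough illustration of the difference (a sketch with made-up variable names, not taken from the blog post): a numeric column with length 8 uses 8 bytes of storage, yet it can display many more than 8 characters, while a character column can display fewer characters than it stores.

```sas
data _null_;
  length amount 8 code $10;
  /* 8 bytes of storage, but the displayed value is wider than 8 characters */
  amount = 1234567890.25;
  put amount= comma20.2;

  /* 10 bytes stored, only the first 5 characters shown by the $5. format */
  code = 'ABCDEFGHIJ';
  shown = put(code, $5.);
  put code= shown=;
run;
```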
I am new to SAS and currently working on a small piece of work with SAS.
Could I please ask what the below format means? I believe the 8. is formatting two digits to the right of the decimal place such as 896.33 but I am not sure. Not really sure what input means.
input(tablename.fieldname, 8.)
That is an INFORMAT, not a FORMAT. It means to read the first 8 characters as a number. If there is a decimal point in the data then it is used naturally. You could have up to 7 digits to the right of the decimal point (since the decimal point would use up the eighth character position). It will also support reading scientific notation so '896.33E2' would mean the number 89,633.
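A quick sketch of that behaviour, using literal strings in place of tablename.fieldname. The last line uses the 8.2 informat to show where the "two digits to the right of the decimal place" idea comes from: a d value on an informat only applies when the data contains no decimal point.

```sas
data _null_;
  a = input('896.33', 8.);        /* decimal point in the data is used: 896.33 */
  b = input('896.33E2', 8.);      /* scientific notation: 89633 */
  c = input('123456789012', 8.);  /* only the first 8 characters are read: 12345678 */
  d = input('89633', 8.2);        /* no decimal point, so the value is divided by 100: 896.33 */
  put a= b= c= d=;
run;
```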
My BI department just ran into the SAS error: this range is repeated, or values overlap.
I found some links they looked at and found that there was an error in a macro.
The cause was that the length of a numeric variable had been changed from 7 to 6 bytes.
Now that they have changed it back to its previous value, everything is OK.
What is this behaviour all about? Is there some logic to it?
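When reducing the length of a variable from 7 to 6 bytes, some numbers might get "truncated". With 7 bytes, integers up to 35,184,372,088,832 can be stored exactly, while with 6 bytes only integers up to 137,438,953,472 can. Once values are truncated, two distinct values can become identical, which would explain a "range is repeated" error. Decimal numbers should always be length 8. See here for details.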
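The truncation can be seen with a sketch like the following, assuming a Windows or UNIX session. The data set and variable names are invented for the example. Truncation only happens when the value is written to a data set, so the comparison is done after reading it back.

```sas
data trunc;
  length id6 6 id7 7;
  /* one above the largest integer that length 6 can hold exactly */
  big = 137438953472 + 1;
  id6 = big;
  id7 = big;
run;

data _null_;
  set trunc;
  /* 1 if the stored value survived unchanged, 0 if it was truncated */
  match6 = (id6 = big);
  match7 = (id7 = big);
  /* expected on Windows/UNIX: match6=0 match7=1 */
  put id6= 20. id7= 20. match6= match7=;
run;
```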
I'm building our inventory feed for Amazon Seller Central in OpenOffice Calc but can't work out how to convert our inhouse product IDs to the Amazon required format GCID.
The standard-product-id must have a specific number of characters according to type: GCID (16 alphanumeric characters), UPC (12 digit number), EAN (13 digit number) or GTIN(14 digit number).
Our product IDs vary by manufacturer, eg:-
123456
AB123456
1234AB
Where the ID is numerical only I can format the cells with leading zeros, however this doesn't work if the cell contains letters.
My file has over 10,000 products so I'm wondering if there is a formula I can apply to all cells to instantly convert them to GCID?
The question seems to have been asked under a misapprehension. Still, the example (123456, AB123456, 1234AB) represents three different IDs, and padding to a specified length is quite a common requirement (see, e.g., the String.PadLeft method), so a suggestion for OpenOffice might be of use to someone, one day.
The convention is to pad with 0s. However, some spreadsheets automatically strip these off the front of numbers (as in the first example), and databases tend to prefer fields of a consistent format. I therefore suggest separating the padding from the ID with a hyphen, to aid identification of alphanumeric codes and to force text format:
=REPT(0;15-LEN(A1))&"-"&A1