I am trying to convert a SAS 9 file to Stata 14. The dates in my SAS file use the DDMMYYP10. format.
StatTransfer 13 is transferring my date variables as integers rather than in a date format to Stata 14.
I tried to use the date-fmt-write options in StatTransfer 13, without success.
This is a Stata solution, rather than a StatTransfer one.
SAS stores dates as integers, with a zero of January 1, 1960. mmddyyp10. is just a format that puts a period between components, so 18031 looks like 05.14.2009 after FORMAT datevar mmddyyp10.; (if I recall the SAS syntax correctly).
Conveniently, Stata uses the same coding convention for dates. You just need to apply format datevar %tdNN.DD.CCYY in Stata (or %tdDD.NN.CCYY for the day-first ddmmyyp10. layout mentioned in the question):
. display %tdNN.DD.CCYY 18031
05.14.2009
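The shared day-count convention can also be checked outside either package. The following is a minimal Python sketch (not SAS or Stata code; the function name from_day_count is made up for illustration) that counts days from 1 January 1960 and prints the result in both period-separated layouts:

```python
from datetime import date, timedelta

# SAS and Stata both store daily dates as days since this day (value 0).
EPOCH = date(1960, 1, 1)

def from_day_count(days: int) -> date:
    """Convert a SAS/Stata daily-date integer to a calendar date."""
    return EPOCH + timedelta(days=days)

d = from_day_count(18031)
print(d.strftime("%m.%d.%Y"))  # month.day.year, like mmddyyp10. -> 05.14.2009
print(d.strftime("%d.%m.%Y"))  # day.month.year, like ddmmyyp10. -> 14.05.2009
```

The integer itself never changes; only the display layout does, which is why applying a format in Stata is enough.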
Related
I'm very new to SAS and I'm trying to read a txt file that contains date and time. The file is shown in the following figure
I believe I have tried all the possible options that I can think of to read the file but the output is still in numeric form. The following is the code I'm using
data wb_bg_1619;
infile "C:\Users\daizh\Desktop\Ren\SAS\wb_bg_0215.txt" firstobs=3 missover;
informat DATE DATE7. TIME TIME5. ;
input DATE TIME BG;
run;
proc print data=wb_bg_1619;
run;
The output looks like this
You've used an informat to automatically convert a date stored as text into a numeric SAS date value, which is the number of days since Jan 1, 1960. To display that value in a human-readable form, you need to use a regular format. Add the following to the top or bottom of your data step:
format date date9.
time time.
;
This changes how the data is displayed to you, but does not change how SAS works with it. As far as SAS is concerned, a date is only a number. You could run the rest of your program without ever using a format and get the right numbers and calculations with it if you wanted to, but that sure makes troubleshooting hard.
To remember the difference between a format and informat:
informats are for inputs
formats are for you
I've got a temporary work table with a date variable source_datetime in SAS DIS. This variable is in the DATETIME22.6 format.
I have a Teradata table with a date field target_date (type DATE), and using a table loader I am attempting to map source_datetime to target_date. When I run the transformation I get the error
ERROR: A SAS value cannot be converted to a Teradata date
The temporary work table is populated with good data. When I attempt the conversion from DATETIME22.6 to DATE9. the output looks like "*********".
Much gratitude.
I know very little about either DIS or Teradata, but I don't think either is related to your problem.
Datetime values are the number of seconds since Jan 1, 1960 00:00:00. Date values are the number of days since Jan 1, 1960.
It sounds like you are trying to apply the date9 format to a datetime value. If you do this, it will usually look like ********* because the number of seconds is way too high to be represented as a date. If you want to keep the datetime value but have it formatted like a date, use the dtdate9 format. Otherwise, you could convert the datetime value to a date value with the datepart() function and then use the date9 format.
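The difference between the two scales can be illustrated with a small Python sketch (not SAS code; the helper named datepart here only mimics the idea of the SAS function, and the sample value is hypothetical). It converts seconds since 1960 into whole days since 1960:

```python
from datetime import date, datetime, timedelta

SECONDS_PER_DAY = 86_400
EPOCH_DATE = date(1960, 1, 1)
EPOCH_DATETIME = datetime(1960, 1, 1)

def datepart(datetime_value: float) -> int:
    """Seconds since 1 Jan 1960 00:00:00 -> whole days since 1 Jan 1960."""
    return int(datetime_value // SECONDS_PER_DAY)

# Hypothetical datetime value: 14 May 2009 10:30:00 as seconds since 1960.
dt_value = (datetime(2009, 5, 14, 10, 30) - EPOCH_DATETIME).total_seconds()

days = datepart(dt_value)
print(days)                               # 18031
print(EPOCH_DATE + timedelta(days=days))  # 2009-05-14

# Misreading the raw seconds as a day count lands millions of years ahead,
# which is why a date format cannot display it.
print(round(dt_value / 365.25), "years after 1960")
```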
I have the number 90724, which is actually 24/07/2009. How can I output the number in that date format? Another example: 100821 should be 21/08/2010.
data want;
set testData;
format date ddmmyyyy.;
run;
Cheers
You really need to learn about informats. Another good introductory source is this UCLA site.
What you need to specify is the layout in which your dates are stored, using the informat yymmdd6. It uses the YEARCUTOFF= system option to determine which century a 2-digit year falls into; see Adjusting Dates in a New Century and YEARCUTOFF= System Option.
Note: The default here is 1920, which spans the 100-year range between 1920 and 2019 (the default differs between SAS releases, so check yours). If your dates are outside this range, set the appropriate cutoff value using OPTIONS YEARCUTOFF=nnnn;
data test;
/* z6. pads with a leading zero, so 90724 becomes the 6-character string 090724 */
dateNumber=100821; ProperDate=input(put(dateNumber,z6.), yymmdd6.); output; /* ProperDate=21AUG2010 */
dateNumber=90724; ProperDate=input(put(dateNumber,z6.), yymmdd6.); output; /* ProperDate=24JUL2009 */
format ProperDate date9.;
run;
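The 100-year window behind the cutoff can be sketched in Python (an illustration of the idea only, not SAS's implementation; parse_yymmdd and its year_cutoff parameter are hypothetical names). A two-digit year is placed in the first year at or after the cutoff that ends in those two digits:

```python
from datetime import date

def parse_yymmdd(number: int, year_cutoff: int = 1920) -> date:
    """Parse a yymmdd number using a 100-year window starting at year_cutoff."""
    text = f"{number:06d}"  # zero-pad: 90724 -> "090724"
    yy, mm, dd = int(text[:2]), int(text[2:4]), int(text[4:])
    year = year_cutoff - year_cutoff % 100 + yy  # same century as the cutoff
    if year < year_cutoff:                       # before the window: next century
        year += 100
    return date(year, mm, dd)

print(parse_yymmdd(100821))  # 2010-08-21
print(parse_yymmdd(90724))   # 2009-07-24
print(parse_yymmdd(200101))  # 1920-01-01, the first year of the window
```

With a cutoff of 1920, the years 20 to 99 map to 1920-1999 and 00 to 19 map to 2000-2019, matching the range described in the note.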
I'm using Stata 12.0.
I have a CSV file of exposures for days of the year e.g. 01/11/2002 (DMY).
I want these imported into Stata and it to recognise that it is a date variable. I've been using:
insheet using "FILENAME", comma
But by doing this I am only getting the dates as labels rather than names of the variables. I guess this is because Stata doesn't allow variable names to start with numbers. I have tried to reformat the cells as Dates in Excel and import but then Stata thinks the whole column is a Date and changes the exposure data into dates.
Any advice on the best course of action is appreciated...
As commented elsewhere, I too think you probably have a dataset that is best formatted as panel data. However, I address first the specific problem I think you have according to your question. Then I show some code in case you are interested in switching to a panel structure.
Here is an example CSV file open as a spreadsheet:
And here the same file, open in a text editor. Imagine the ; are ,. This is related to my system's language settings.
Running this (substitute delimiter(";") for comma, in your case):
clear all
set more off
insheet using "D:\xlsdates.csv", delimiter(";")
results in
which I think is the problem you describe: dates as variable labels. You would like to have the dates as variable names. One solution is to use a loop and strtoname() to rename the variables based on the variable labels. The following goes after importing with insheet:
foreach var of varlist * {
local j = "`: variable l `var''"
local newname = strtoname("`j'", 1)
rename `var' `newname'
}
The result is
The function strtoname() will replace the illegal characters with underscores (_). See help strtoname.
Now, if you want to work with a panel structure, one way would be:
clear all
set more off
insheet using "D:\xlsdates.csv", delimiter(";")
* Rename variables
foreach var of varlist * {
local j = "`: variable l `var''"
local newname = strtoname("`j'", 1)
rename `var' `newname'
}
* Generate ID
generate id = _n
* Change to long format
reshape long _, i(id) j(dat) string
* Sensible name
rename _ metric
* Generate new date variable
gen dat2 = date(dat,"DMY", 2050)
format dat2 %d
list, sepby(id)
As you can see, there's no need to do anything beforehand in Excel or in an editor. Stata seems to be enough in this case.
Note: I've reused code from http://www.stata.com/statalist/archive/2008-09/msg01316.html.
A further note on performance: a CSV file with 122 variables or days (columns) and 10,000 observations or subjects (rows) plus 1 header row will produce 1,220,000 observations after the reshape. I tested this on an old machine with a 1.79 GHz AMD processor and 640 MB RAM, and the reshape took approximately 8 minutes. Stata 12 has a hard limit of 2,147,483,647 observations (although available RAM determines whether you can actually reach it), and Stata SE has a limit of 32,767 variables.
There seems to be some confusion here between the names that variables may have, the values that variables may have and the types that they may have.
Thus, the statement "Stata doesn't allow variables to start with numbers" appears to be a reference to Stata's rules for variable names; if it were true, numeric variables would be impossible.
Stata has no variable (i.e. storage) type that is a date. Strictly, it has no concept of a date variable, but dates may be held as strings or numbers. Dates may be held as strings insofar as any text indicating a date is likely to be a string that Stata can hold. This is flexible, but not especially useful. For almost all useful work, dates need to be converted to integers and then assigned a display format that matches their content to be readable by people. Stata has various conventions here, e.g. that daily dates are held as integers with 0 meaning 1 January 1960.
It seems likely in your case that daily dates are being imported as strings: if so, the function date() (also known as daily()) may be used to convert to an integer date. The example here just uses the minimal default display format for daily dates: friendlier formats exist.
. set obs 1
obs was 0, now 1
. gen sdate = "12/03/12"
. gen ndate = daily(sdate, "DMY", 2050)
. format ndate %td
. l
+----------------------+
| sdate ndate |
|----------------------|
1. | 12/03/12 12mar2012 |
+----------------------+
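The role of the third argument (2050 in the example) can be sketched in Python. This is only an illustration of the rule as I understand it, namely that a two-digit year is read as the latest year not exceeding that top year; daily_dmy and top_year are hypothetical names, and the sketch handles only slash-separated day/month/year strings:

```python
from datetime import date, timedelta

STATA_EPOCH = date(1960, 1, 1)  # daily date 0 in Stata

def daily_dmy(text: str, top_year: int = 2050) -> int:
    """Turn a 'DD/MM/YY' or 'DD/MM/YYYY' string into days since 1 Jan 1960."""
    day, month, year = (int(part) for part in text.split("/"))
    if year < 100:
        year += top_year - top_year % 100  # try the top year's century first
        if year > top_year:                # too late: fall back one century
            year -= 100
    return (date(year, month, day) - STATA_EPOCH).days

ndate = daily_dmy("12/03/12")
print(ndate)                                 # integer stored in the variable
print(STATA_EPOCH + timedelta(days=ndate))   # 2012-03-12
```

The string stays a string until it is converted; after conversion the variable holds an integer, and the display format only decides how that integer is shown.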
If your variable names are being misread, as guessed by @ChrisP, you may need to tell us more. A short and concrete example is worth more than a longer verbal description.
I'm completely new to Stata. I'm trying to merge 3 different datasets that contain dates in the format d-mmm-yy. While trying to merge, I'm encountering an error saying
date is str 9 in using data stata
r(106)
I have no clue what this error is about. Need some help. I can provide any additional info if required.
Thanks
This probably means that in some data sets the date is stored as a number (Stata's convention is Unix-like: the number of elapsed days since 1 Jan 1960), while in others it is a string (which is exactly what Stata tells you). You need to convert them all to the same type before merging, e.g. with
generate long n_date = date(date, "DMY", 2050)
See help date() or help date functions.