I tried to open a .dat file using Stata, and it actually opened, but the data set was a complete mess. I took the file from NBER (CPS data)...
click on the A icon of the year 1964 March.
I tried the regular Stata procedure for .dat files: File -> Import -> ASCII data created by a spreadsheet (delimiter " "), as recommended in the Stata manual for .dat files.
But it is still not working. Are there any other ways to open .dat file? Can I convert it to .csv somehow?
(All the data files are ASCII files compressed with the Unix compress command.)
There is a Java app for getting data from CPS, DataFerrett. It lets you get CPS and other data sets, but it is not very efficient.
I can show you an example of how to open one of the files yourself (this works for any year from 1989 to 2012).
1. Download the .dat file
2. Save it in a Desktop folder (C:\Users\Owner...)
3. Download the corresponding .do and .dct files from here
4. Save them in the same folder
5. Open the .dat file in Stata just the way you opened it in your question
6. Save it as a Stata .dta file in the same folder (C:\Users\Owner...)
7. Open the .do file (using Notepad++) that is in your (C:\Users\Owner...) folder
8. At the very beginning you will see that the author defines local variables for the paths of the .dta, .dat and .dct files. Change the paths so that they point to the saved .dta, .dat and .dct files in your folder (C:\Users\Owner...) on your Desktop
9. Reopen Stata, and run the .do file from your folder (C:\Users\Owner...)
10. Done! Save the .dta file
Now, for the years 1962 to 1988, you can follow the same procedure (10 steps) as I explained above, but unfortunately NBER does not provide the .do and .dct files. This means that you have to write them yourself. Take one of the available .do and .dct files from any of the years 1989 to 2012 as a benchmark, and write your own .do and .dct files. You will have to make corrections so that the new .do and .dct files are consistent with the corresponding .pdf documentation for each year. I know it is very tedious, but this is the only way you can handle it.
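Since the question also asks about converting to .csv: the dictionary (.dct) approach above boils down to slicing each line at fixed character positions. A minimal Python sketch of that idea follows. The layout shown (`record_type`, `age`, `income` and their positions) is entirely made up for illustration, and the function name is my own; the real positions have to come from the .pdf documentation for the year you are working with.

```python
import csv

# Hypothetical layout: (column name, start, end), 0-based and end-exclusive.
# These names and positions are placeholders, NOT the real CPS layout.
EXAMPLE_LAYOUT = [("record_type", 0, 1), ("age", 1, 3), ("income", 3, 8)]


def fixed_width_to_csv(dat_path, csv_path, layout):
    """Slice every line of a fixed-width ASCII file into named CSV columns."""
    with open(dat_path, "r", encoding="ascii", errors="replace") as src, \
         open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        # Header row taken from the layout names.
        writer.writerow([name for name, _, _ in layout])
        for line in src:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            writer.writerow([line[start:end].strip() for _, start, end in layout])
```

Calling `fixed_width_to_csv("cps.dat", "cps.csv", EXAMPLE_LAYOUT)` (file names are placeholders) would write one CSV row per input line. Files with more than one record type would need a separate layout per type.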
We need more information.
".dat" is not an extension that is special so far as Stata is concerned. Perhaps you meant .dta.
Even if so, what file was it, what command did you use and what was wrong?
The page you linked to leads to numerous files. We have not a hope of guessing which you mean.
Spelling is "Stata".
This might not save you from spending days digging into that data, but here are some ideas:
The file contains two completely different kinds of lines. This might be the reason why you can't import them. You can see this by opening the unzipped file in a text editor. You have to find out what that means.
What do you want to obtain from this file? According to the PDF it contains 85 different values per record. Do you need them all? If you're only interested in a few values, you could extract them in a Unix shell.
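As a quick way to check the "two kinds of lines" observation without scrolling through the file, one could tally lines by length and leading character. This is only a sketch in Python; the function name is hypothetical, and the assumption that record types differ in length or first character is a guess that the documentation would have to confirm.

```python
from collections import Counter


def summarize_line_kinds(path, prefix_len=1):
    """Count lines by (length, leading characters) to expose distinct record types."""
    kinds = Counter()
    with open(path, "r", encoding="ascii", errors="replace") as f:
        for line in f:
            line = line.rstrip("\r\n")
            # Key each line by its length and its first prefix_len characters.
            kinds[(len(line), line[:prefix_len])] += 1
    return kinds
```

If the result shows exactly two dominant keys, that supports the idea of two interleaved record layouts, each of which would then need its own column positions.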
I currently have Python code running an HTML scrape and storing the data as CSV files in a folder on my computer called "New Data". I would then like to run my SAS code on every CSV file that is uploaded to that folder. After I run that data through my SAS code I would like to move all of the CSV files from "New Data" to a folder named "Processed Data". I was wondering what SAS code would help me move CSV files from one folder on my computer to another after they have been processed. Also, the code has to be automated, as there will be new CSV files coming in daily.
Thanks!
I have done something similar. Part of my data step is copying files from one folder to another. I used DOS commands in SAS. The command needs to be in single quotation marks. If there is a space in the folder name or file name, the path needs to be in double quotation marks.
Here is an example for all the csv files in the "new data" folder to be moved to the "processed data" folder:
data a;
b = system ('move "x:\sas\project\new data\*.csv" "x:\sas\project\processed data\" ');
run;
Pay attention to the quotation marks. As Reeza mentioned, this code assumes XCMD is enabled.
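Since the scrape already runs in Python, the move step could alternatively be done there once the SAS job has finished, which avoids depending on XCMD. A minimal sketch using only the standard library follows; the function name is hypothetical, and the folder paths in the usage note are simply the ones from the example above.

```python
import shutil
from pathlib import Path


def move_csv_files(src_dir, dst_dir):
    """Move every .csv file from src_dir to dst_dir; return the moved file names."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the target folder if missing
    moved = []
    for csv_file in sorted(src.glob("*.csv")):
        target = dst / csv_file.name
        shutil.move(str(csv_file), str(target))
        moved.append(target.name)
    return moved
```

For the folders above this would be called as `move_csv_files(r"x:\sas\project\new data", r"x:\sas\project\processed data")`, for example from the same scheduled script that runs the scrape.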
I'm using Stata 11 on OSX, new to Stata. Someone has sent me a .do file and I want to amend it and run it.
If I open the file in Stata (using File > Open), I see the file appear like this:
I can click on text, but I don't seem to be able to edit or select it. It's almost like an image rather than a text file.
So, um, how do I edit the text?
All of my Googling suggests I need to use the "do-file editor" - I'm not sure if this is it, and if so, how I set it to edit mode!
Open a new .do file (ctrl+9). Then, in that window, go to File > Open and open the .do file in the editor. That should do the trick.
I am using .txt files in my program for reading and writing records (the records contain both text and numerals). Recently I learned that .dat files can also be used like .txt files for file operations. I would like to know the difference between the two and the advantages and disadvantages of one over the other.
Text files, or .txt files, are easy for people to read but a bit hard to parse in programs, whereas .dat is usually used to store data that is not just plain text.
Generally, .txt files contain letters, characters and symbols that are human-readable.
.dat files often hold binary data, which is not always printable on screen.
The extension of a file is a helper so that the operating system (or user) can choose the appropriate program to open it. It says nothing binding about the actual file contents. There are some conventions about which extensions to use, but there is nothing keeping you from using any arbitrary extension for your files. For instance, you can rename a .jar file to a .zip file and open it with pkunzip.
So for C++ the extension does not matter, but for you as a programmer it may give a hint of the file contents i.e. open it in text or binary mode.
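The same point can be shown in a few lines. The sketch below uses Python only for brevity (the behaviour is the same in C++ when the files are opened in binary mode): identical bytes are written to a `.txt` and a `.dat` file and read back unchanged. The file names and payload are arbitrary examples.

```python
import os
import tempfile


def write_and_read_back(path, payload):
    """Write raw bytes to path in binary mode and return what is read back."""
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as f:
        return f.read()


# Arbitrary example record containing a NUL byte, which is not plain text.
payload = b"id=7\x00name=Ann\n"
with tempfile.TemporaryDirectory() as folder:
    as_txt = write_and_read_back(os.path.join(folder, "records.txt"), payload)
    as_dat = write_and_read_back(os.path.join(folder, "records.dat"), payload)

# The extension had no effect on what was stored.
same_bytes = as_txt == as_dat == payload
```

What matters is the mode the file is opened in and the format your program writes, not the name on disk.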
In most languages, such as C and C++, the file type makes no difference to file operations (read, write, or edit).
Just note that if you want to work with binary files you should open them in binary mode, because in text mode some platforms translate line endings and may treat a particular byte (Ctrl-Z, 0x1A, on Windows) as the end of the file. .dat files are often binary too!
If you want to store and read some data, XML files and sometimes .dat files can be better because there are good libraries for reading them, so they don't need the hand-written parsing that plain text files do.
Currently I'm just saving the file as MS-DOS CSV with excel. I'm looking for the fastest way (in terms of writing the code) of doing it automatically.
I strongly prefer C++, but any simple executable program I can call from a C++ app would do.
Unzip the .xlsx file with, e.g., WinZip and have a look at the resulting files. This may help.
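Because an .xlsx file is a ZIP container, its parts can also be listed programmatically instead of with WinZip. A small sketch in Python using the standard `zipfile` module follows; the function name is my own, and this only lists the parts rather than converting anything.

```python
import zipfile


def list_xlsx_parts(path):
    """Return the names of the files stored inside an .xlsx (ZIP) container."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

In a typical workbook the sheet data lives in XML parts under `xl/worksheets/`, which is where to look when working out how the cell values are stored.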
I have a folder that contains 300 different files. There are 150 .cft files and 150 .s01 files. Each .cft file has a corresponding .s01 file of the same name. I would like to create a program that can read the files from the folder and place each .cft file and its corresponding .s01 file into an excel document. I would like the .cft file to be on the first worksheet in the document and the .s01 file to be on the second sheet. Then I would like the program to save the file and name it (---------).xls. The (---------) would be the name of the .cft and .s01 file since they are both the same.
So!!! I wrote a program that is able to take the .cft file and the .s01 file, append them and place them in a user defined .xls document. However...I don't want to manually get the names of the 150 files and have to type each one into the program. I also don't want the files to be placed on the same worksheet.
So!!!! I don't want to waste time trying to code something impossible, so before I spend any more time on this I have a few questions:
Is it possible to read all of the files in a folder and match files of the same name but with different types?
If this is possible, is it then possible to place the corresponding .cft file and .s01 file in the same excel document but on different worksheets?
Then, is it possible to create and save this worksheet as (---------).xls, (-------) being the name of the matching .cft and .s01 file?
So basically...I want to write this code because I am lazy and I don't want to do anything manually ><;;; lol
Example:
The main folder contains 8 files:
dog.cft dog.s01 cat.cft cat.s01 tree.cft tree.s01 bird.cft bird.s01
The program reads all of the files in the folder and recognizes that dog.cft and dog.s01 go together.
The program then creates an excel document and on worksheet 1 places dog.cft and on worksheet 2 places dog.s01.
The program then saves the excel document as dog.xls
Then the program loops through the main folder repeating this process for each of the .cft and .s01 pairs until all 150 pairs have been separated and saved in their own excel document.
I don't know if I'm dreaming a little too big with this but any advice is much appreciated!
Personally, I would do this with a macro in Excel rather than in C++, because Excel-related tasks are much easier that way. All of the requirements are possible using VBA within Excel.
Yes, it's possible.
For the listing of files in a folder, you can use the Windows API functions FindFirstFile and FindNextFile. When you finish iterating the folder, you'll need to call FindClose.
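The matching logic itself is independent of the API used to list the folder. Here is a minimal sketch of it in Python (chosen only for brevity; in C++ the same grouping would sit inside the FindFirstFile/FindNextFile loop). The function name is hypothetical, and it assumes the two files of a pair share exactly the same base name.

```python
from collections import defaultdict
from pathlib import Path


def pair_by_stem(folder, extensions=(".cft", ".s01")):
    """Group files by base name and keep only names that have every extension."""
    by_stem = defaultdict(dict)
    for entry in Path(folder).iterdir():
        ext = entry.suffix.lower()
        if entry.is_file() and ext in extensions:
            by_stem[entry.stem][ext] = entry
    # Keep complete pairs only, ordered as in `extensions`.
    return {
        stem: tuple(found[ext] for ext in extensions)
        for stem, found in sorted(by_stem.items())
        if len(found) == len(extensions)
    }
```

With the example folder from the question, the result would have the keys `bird`, `cat`, `dog` and `tree`, each mapping to its `.cft` and `.s01` path; each key can then serve as the name of the workbook to save.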
For creating the Excel spreadsheet and working with the workbook's sheets, you can use COM automation. Here's a link to an article on doing so from C++ (MFC); the article explains where to find one that isn't MFC based.
If you get started and have specific questions about either of the tasks, please post them as separate questions. This should have been two individual questions, in fact - one about iterating the content of a folder and a different one about working with Excel files from C++.