Is there a simple way to export the "underlying" data of a Stata graph in order to reproduce that graph in MS Excel? Imagine you create a ROC curve using roctab y yhat, graph and you want to reproduce that graph in Excel.
I assume that you do not have access to the actual raw data that was used to create the .gph in the first place and somehow want to reverse-engineer the .gph file... then, eek, good luck!
If, however, you do have access to the data originally used, then you can use the putexcel command, new in Stata 13.
A more detailed description of the putexcel command can be found in the Stata press release on exporting tables to Excel.
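For example, something along these lines (a sketch only, using Stata 13's syntax; results.xlsx and the cell layout are placeholders, and I'm assuming the AUC that roctab leaves behind in r(area) is one of the numbers you want to carry over):

* sketch: write the AUC returned by roctab to a new workbook
roctab y yhat
putexcel A1=("AUC") B1=(r(area)) using results.xlsx, replace

Note that putexcel only writes things you already have in r(), e(), scalars, or matrices; the sensitivity/1-specificity points that the ROC graph itself plots would still have to be computed or saved as variables first.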
The data in the .gph file are stored in serset format between the serset begin and end tags. There's no utility I know of that will parse the serset information, but the format is very similar to Stata's .dta file format (version 115 and below). I wrote up the basic file format information here. The Python library pandas has code for reading/writing .dta files, so with that code you could probably create your own serset reader/writer.
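As a very rough sketch of the pandas building blocks involved (file names are placeholders, and this only covers ordinary .dta files, not the serset itself -- pandas.io.stata.StataReader is the code you would adapt for that):

# sketch: read a .dta file with pandas and hand the values to Excel
import pandas as pd

df = pd.read_stata("mydata.dta")          # reads .dta files, incl. the old format-115 layout
df.to_excel("mydata.xlsx", index=False)   # needs openpyxl or xlsxwriter installed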
I am using the Google Benchmark library (https://github.com/google/benchmark/blob/main/docs/user_guide.md) to do time and complexity analysis of an algorithm in C++.
I read that it is possible to get an output file in CSV or JSON format and then plot the results. How can I do this?
I tried to follow the instructions in the library's README on GitHub, but it doesn't produce any output file...
My goal is to produce two plots:
how the execution time grows as a function of the input size of the array
how the number of comparisons grows as a function of the input size of the array
I'm using Python's matplotlib.pyplot library to create the plots, and to do that I think I need Google Benchmark's data in a CSV file.
I couldn't find that much documentation or discussion about this library on the web.
Could anyone help me pls?
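To make the question more concrete, here is the kind of script I have in mind (a sketch only: it assumes the benchmark binary is run with --benchmark_out=results.json --benchmark_out_format=json, that the input size appears after the / in each benchmark name, and that the comparison count is reported via a user counter I would name "comparisons"):

# run the benchmarks first, e.g.:
#   ./my_benchmark --benchmark_out=results.json --benchmark_out_format=json
import json
import matplotlib.pyplot as plt

with open("results.json") as f:
    results = json.load(f)

sizes, times, comps, unit = [], [], [], "ns"
for bm in results["benchmarks"]:
    arg = bm["name"].split("/")[-1]      # e.g. "BM_MySort/1024" -> "1024"
    if not arg.isdigit():                # skip _BigO / _RMS / _mean aggregate entries
        continue
    sizes.append(int(arg))
    times.append(bm["real_time"])
    unit = bm.get("time_unit", unit)
    if "comparisons" in bm:              # user counter, if the benchmark defines one
        comps.append(bm["comparisons"])

plt.figure()
plt.plot(sizes, times, marker="o")
plt.xlabel("input size n")
plt.ylabel("execution time ({})".format(unit))
plt.savefig("time_vs_n.png")

if comps:
    plt.figure()
    plt.plot(sizes, comps, marker="o")
    plt.xlabel("input size n")
    plt.ylabel("number of comparisons")
    plt.savefig("comparisons_vs_n.png")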
I downloaded daily MODIS DATA LEVEL 3 data for a few months from https://disc.gsfc.nasa.gov/datasets. The filenames are of the form MCD06COSP_M3_MODIS.A2006001.061.2020181145945 but the files do not contain any time dimension. Hence when I use ncecat to concatenate various files, the date information is missing in the resulting file. I want to know how to add the time information in the combined dataset.
Your commands look correct. Good job crafting them. Not sure why it's not working. Possibly the input files are HDF4 format (do they have a .hdf suffix?) and your NCO is not HDF4-enabled. Try to download the files in netCDF3 or netCDF4 format and your commands above should work. If that's not what's wrong, then examine the output files from each step of your procedure, identify which step produces the unintended results, and narrow your question accordingly. Good luck.
I would like to export the results of cross-section dependence tests for 12 panel data sets to a table, in order to compare them with similar tests done in other software. Below is the regression-and-test example from the xtcsd help page (unfortunately its example dataset is not available, but a similar dataset, tbl15-1.dta, from the xttest2 help page is); it should help you understand what I'm trying to achieve:
use "http://fmwww.bc.edu/ec-p/data/Greene2000/TBL15-1.dta"
xtset firm year
xtreg i f c, fe
xtcsd, pesaran
To display the test statistic, I can use
return list
How do I access the p-value for that statistic?
I have found how to export estimation results using the command esttab.
How do I export test results to a file in Stata?
Following Maarten Buis's comment below on the p-value, here is how I exported the test results to a CSV file using low-level file access:
file open xtcsdfile using xtcsd.csv, write replace
file write xtcsdfile "pesaran,pvalue" _n
file write xtcsdfile (r(pesaran)) "," (2*(normal(-abs(r(pesaran))))) _n
file close xtcsdfile
The Pesaran statistic will (asymptotically) follow a standard normal distribution if the null hypothesis is true, so the p-value is 2*(normal(-abs(r(pesaran)))).
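For a quick look without writing a file, the same quantities can simply be displayed after the test (a minimal sketch using the r(pesaran) result shown above):

xtcsd, pesaran
display "Pesaran CD statistic: " r(pesaran)
display "p-value: " 2*normal(-abs(r(pesaran)))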
I know that I can create a .dta file if I have a data (.dat) file and a dictionary (.dct) file. However, I want to know whether the reverse is also possible. In particular, if I have a .dta file, is it possible to generate a .dct file along with a .dat file? (Stata has an export command that allows exporting as an ASCII file, but I haven't found a way to generate a .dct file.) StatTransfer does generate .dct and .dat files, but I was wondering whether it is possible without using StatTransfer.
Yes. outfile will create dictionaries as well as export data in ASCII (text) form.
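For example (a sketch, with Stata's bundled auto data standing in for your dataset and arbitrary file names):

sysuse auto, clear
outfile using auto.dct, dictionary replace    // dictionary followed by the data
outfile using auto.raw, replace               // data only, as plain text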
If you want dictionaries and dictionaries alone, you would need to delete the data part.
If you really want two separate files, you would need to split each file produced by outfile.
Either is programmable in Stata, or you could just use your favourite text editor or scripting language.
Dictionaries are in some ways a very good idea, but they are not as important to Stata as they were in early versions.
I'm using pyuno to read an Excel spreadsheet (running on Linux). Many cells have formulas referring to add-ins that are, obviously, not available. However, the cell values are what I want.
But when I load and read the sheet, it seems those formulas are being evaluated and thus the values are being overwritten with errors.
I've tried several things, none of which have worked:
setting the flags AutomaticCalculation=False and MacroExecutionMode=NEVER_EXECUTE in the call to desktop.loadComponentFromURL
calling document.enableAutomaticCalculation(False) on the loaded document (see the sketch below)
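Concretely, what I tried looks roughly like this (a sketch: the file path is a placeholder, and I'm assuming a headless soffice instance already listening on a socket):

import uno
from com.sun.star.beans import PropertyValue

def prop(name, value):
    p = PropertyValue()
    p.Name = name
    p.Value = value
    return p

# connect to soffice started with --accept="socket,host=localhost,port=2002;urp;"
local_ctx = uno.getComponentContext()
resolver = local_ctx.ServiceManager.createInstanceWithContext(
    "com.sun.star.bridge.UnoUrlResolver", local_ctx)
ctx = resolver.resolve(
    "uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext")
desktop = ctx.ServiceManager.createInstanceWithContext(
    "com.sun.star.frame.Desktop", ctx)

never = uno.getConstantByName("com.sun.star.document.MacroExecMode.NEVER_EXECUTE")
load_props = (prop("Hidden", True), prop("MacroExecutionMode", never))
doc = desktop.loadComponentFromURL(
    "file:///path/to/sheet.xls", "_blank", 0, load_props)
doc.enableAutomaticCalculation(False)        # XCalculatable on the document

cell = doc.Sheets.getByIndex(0).getCellByPosition(0, 0)   # A1
print(cell.getValue())   # still an error, not the cached add-in result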
Any suggestions?
If the formulas don't matter to you, you might circumvent the problem by processing a copy of your spreadsheet in which only the values (not the formulas) are present.
To achieve this quickly, select the whole sheet content, copy, and use Paste Special, keeping only "value"; then save to a new file (make sure you don't overwrite the original file, or every formula will be lost!). Your script should then be able to process this file.
This is an ugly solution, as there must be a way to do it programmatically.
Calc does not yet support using cached results after loading a document in all cases. LibreOffice Calc does now use the cached results for xls documents. The results are also stored in ods files, but they are ignored while loading the document, and the formula result is re-evaluated by compiling and interpreting the saved formula.
There are some plans to add this for ods and xlsx too, but there are many ods producers out there writing incorrect results into the file. So for now the only solution is to keep a second version of the document that saves only the results (or to implement this inside Calc).