Include begin{document}, end{document} in LaTeX tables code generated in Stata

Include begin{document}, end{document} in LaTeX tables code generated in Stata - stata

While using estpost/esttab/esttab in Stata to generate LaTeX tables, latex initialization syntax such as documentclass{}, begin{document}, and end{document} are never included. This means that every LaTeX code generated needs for these to be added.
I have many many tables to create. Is it possible to include these through Stata itself?

There are two potential solutions, the first is to include these using the prehead and postfoot options, which allow you to do this directly, but make table formatting a bit more difficult. Or there is the option to simply use include{asdf.tex} in another file.
Solution 1 example:
sysuse auto, clear
reg price mpg
esttab using "temp.tex", ///
prehead("\documentclass{article}\def\sym#1{\ifmmode^{#1}\else\(^{#1}\)\fi}\begin{document}\begin{tabular}{l*{1}{c}}") ///
postfoot("\end{tabular}\end{document}") ///
replace
This will make a basic table, but doing things like including a title become more difficult with this option.
Second solution, in a tex file, you can include any number of tables thusly:
\documentclass{article}
\begin{document}
\input{any_tex_file.tex}
\input{any_tex_file2.tex}
\end{document}
and in this way you can include all of your tables.
Hope this helps

A better solution could be including this command into your stata esttab output commands
booktabs page(column)

Related

I need to programmatically identify all libraries and data files read by several hundred SAS program files. Can this be accomplished programmatically? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 months ago.
Improve this question
Scenario: We have a list of over 200 SAS files. We need to identify all SAS libraries and data sets used as inputs to these programs, and write out a table linking the SAS input data sets to the associated program files. We are not SAS programmers and are just now becoming familiar with the language. The intent is to design a rearchitecture of the logic of the SAS files to be more modular.
We are conducting this analysis statically - i.e., we are not running SAS, we are attempting to extract this data purely from interrogating the code in program files themselves and we do not have access to the data files.
Solution attempted: we have parsed the SAS programs to identify inputs to SAS Procs and SAS Data steps, however there are several challenges. The approach we are using is as follows:
We have obtained a python-based parser (https://github.com/benjamincorcoran/sasdocs) that extracts key information from SAS files. We have applied it to all 200+ files and extracted parsed content into a text file. However, not all SAS syntax is supported; in particular, DataSet blocks are left as unparsed raw text, Procs with a variable number and names of arguments may be missed, and some commands, like various constructs of “set” and “merge” are missed completely by the grammar that has been implemented in the parser so far.
The parser correctly locates about 60% of the files, especially the libraries and files preceded by a "Set" statement. For reasons we do not understand, not all libraries/files preceded by a "SET" command are captured by this parser.
In addition to the "Set" command, we have observed that SAS can also reference a library/file within a Merge or Sort procedure, without a specific Set command.
We are ignoring SAS files from within the 'work' library that are created during processing; we are only concerned with external input files.
Note that we are not running these programs, we only have access to the SAS Program file sources - hence we do not have access to a SAS log.
Questions:
Is there a more direct way do accomplish this goal? Does SAS understand what files it reads and writes, and is there a method of extracting a list of all libraries and files read by SAS associated with a SAS program?
If there is no method of accomplishing this information programmatically, what are all the ways that SAS can access or reference an external library/file, other than within a SET, MERGE or SORT procedure?

SAS has a procedure that does this, PROC SCAPROC. If you do have access to SAS, this is by far the best solution. You would technically need to run SAS, but even if there are errors, in theory it might work okay - the fact that the dataset doesn't exist should be okay, unless your code is data driven.
If you're unable to run the code or run anything in SAS, you'd need to do something with text analysis.
The key things to look for which would catch most of the possibilities would be (in sort of pseudo regex code):
data [lib.]dataset(could have parens but ignore them);
set( [lib.]dataset(ignore parens))* (could have multiple)
merge( [lib.]dataset(ignore parens))* (could have multiple)
update( [lib.]dataset(ignore parens))* (could have multiple)
modify( [lib.]dataset(ignore parens))* (could have multiple)
data=[lib.]dataset(ignore parens) - this is for most PROCs input, could have spaces around the equals sign
out=[lib.]dataset(ignore parens) - this is for most PROCs output, could have spaces around the equals sign
To get more than the "most" above, you'd want to analyze which PROCs were used. Each PROC can have its own output/input options, for example proc surveyselect could use various different datasets for different things, proc format uses CNTLIN and CNTLOUT, etc. You'd also have to see if there are hash tables or other objects used in the code as that has its own elements.
The other thing you could do, only caring about external files, is identify the libname statements. Once you find them, it's possible you could just look for libname.data in the program - that's how all of the datasets in the external folders (libraries) will be referred to. This won't work, though, if you are using metadata-assigned libraries, unless there are a small enough number of them that you could possibly list them all out (and you have access to SAS to find out the list).
Ultimately, your 100% solution is to hire a SAS consultant to look at the code; without being able to run the code (and thus use SCAPROC), there's not really a perfect solution.

Output to Table instead of Graph in Stata

In Stata, is there a way to redirect the data that a command does into a table instead of a graph?
Example: if someone created a normal probability distribution of data with the pnorm var_name command, is there a way to redirect the data so that instead of appearing in a graph, it appears in a table?

To add to #Noobie's answer:
Different commands work in different ways. There's no better short summary.
What you can look out for includes
generate() options that produce new variables. (There is absolute rule that the options have this name, but that or a similar name is the most common single variety.)
Options that allow saving results to new datasets.
Saved results, especially those visible after return list or ereturn list. These can be quite elaborate, e.g. saving of matrices of counts after tabulate.
More broadly, Stata commands aren't functions! One characteristic of a function, as so named in many languages or programs, is that there is a result, with special cases where the result is void or null. There clearly are statistical programs which in broad terms hinge on calling functions which have results, and what you see displayed is often a side-effect of that. Stata commands don't work like that in the sense that the results of a program can be various. In the case of commands designed just to show something, the "result" may be a display. It's worth noting that Mata, which underlies and underpins Stata, is more recognisably a C-like language, with (e.g.) many matrix extensions, which is based on functions (and much else).

Yes and no. It really depends on the command you are using. You should look at the help files first.
For instance, pnorm does not allow that. You can create the data yourself using the formula for pnorm described in the help file, where the cumulative distribution at some point is plotted against the so-called plotting position.
Other Stata commands allow you to generate the points directly. This is the case for kdensity for instance.

Stata Outreg2: file handle __00000G not found

I'm running a long list of regressions in Stata. Results are exported using outreg2. At random points in time, the execution stops at some outreg2 with the error file handle __00000G not found. When I rerun the whole exercise it works after some tries. Do you have any idea what could be the reason?
My code looks as follows where further regressions of the same type follow.
xtreg l_GDP_capita i.year date_intens, fe r
outreg2 using year_DG_FE_gdp, excel append label addtext(DG FE, YES, YEAR FEs, YES) drop(i.year) ctitle(log GDP per capita, AP) replace
xtreg l_GDP_capita i.year date_intens if any_lez==1, fe r
outreg2 using year_DG_FE_gdp, excel append label addtext(DG FE, YES, YEAR FEs, YES) drop(i.year) ctitle(log GDP per capita, LEZ)

Two thoughts:
(1) your file name needs to be in quotations, so outreg2 using "year_DG_FE_gdp", [blah] instead of outreg2 using year_DG_FE_gdp, [blah].
(2) In case you haven't done so, you need to set a directory. cd "whatever filepath you want to save to" (or a global directory if you want to get fancy). You may have done this earlier in your code (I usually do it at the top of a .do file), but since you've only posted the snippet here I can't tell. Forgetting it is one of the classic problems people tend to have with outreg.

Not 100% sure, but I am having the same problem and I think this may arise when outreg2 is writing to a directory that is being synced (e.g. Dropbox, Google Drive). I believe the conflict may come from a permissions conflict where Stata is trying to change the file as the sync software is uploading the change. If you can pause the syncing of the directory while running the Stata commands, this seemed to help me.

use Stata variable labels in results

I have several numeric variables var_1-var_5 with variable labels attached, like "Rain" and "Snow".
Can any mainstream commands use my variable labels when printing to the results window?
For example, it would be really nice if summarize did so. So far, the labels seem to be useful only for graphics and the describe command.

It seems the answer is: if the command can do it, then it will be documented or done by default.
From http://www.stata.com/statalist/archive/2011-09/msg00902.html (2011), I quote Nick Cox:
... if a command has no option to show variable labels, then you can't
show variable labels with that command; it is usually the case that
there is no such option with statistical commands, because typically
there wouldn't be enough space to show variable labels; and if there
is such an option, then it will be documented. The only alternative is
that you learn how to program in Stata and write your own alternative
commands.
Some user-written commands that maybe do what you want (with respect to summarize) are:
ssc describe labsumm
and
ssc describe fsum
These are from http://www.stata.com/statalist/archive/2008-07/msg00850.html.

How do I wrap large matrix-style output in Stata

I would like to wrap large matrix-style output, as per the corr commands wrap option in an ado file. Unfortunately, corr is not implemented as an ado-file, and pwcorr, which is implemented as an ado-file that I might model my approach on is missing the wrap option.
It would be useful to understand both how do do this for a fixed output width, and also useful to know how to base the definition of width for fixed output to the width of current window settings.

I use matlist for displaying most of my matrices, and it will also take care of wrapping.
More complicated output can be handled by being creative when making the matrix. I find the dotz option of matlist often helpful in that respect. Another trick I use a lot is the fact that a column and name can have two parts: an "equation name" which can be shared with multiple rows or columns and a "row or column name within the equation".
If you need more flexibility you can look at frmttable.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js