I know that Stata has a command called mkspline that generates cubic spline functions. But I want to replicate my Stata output using other software, so I need to learn how to create these spline variables myself.
If, say, I use syntax like this:
mkspline age3sp = age, cubic knots(-13 -7 0 8 16)
How would you do it by hand?
I would read the Stata documentation for mkspline (see http://www.stata.com/manuals14/rmkspline.pdf for an online copy of the version included with Stata 14) and follow the guidance in the Methods and formulas section.
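As a rough illustration of what doing it by hand could look like in other software, here is a minimal Python sketch of a restricted cubic spline basis. It assumes the usual restricted cubic spline formula, which is my reading of that Methods and formulas section: with sorted knots k_1..k_n, the first variable is the original variable, and variable i+1 is { (V - k_i)+^3 - [ (V - k_{n-1})+^3 (k_n - k_i) - (V - k_n)+^3 (k_{n-1} - k_i) ] / (k_n - k_{n-1}) } / (k_n - k_1)^2 for i = 1..n-2. The function name rcs_basis is made up for this example; please check the results against your own mkspline output before relying on it.

```python
def rcs_basis(values, knots):
    """Restricted cubic spline basis (hypothetical helper, not a Stata API).

    Returns one row per value with len(knots) - 1 columns:
    the value itself, then one cubic term per knot except the last two.
    """
    k = sorted(knots)
    n = len(k)
    if n < 3:
        raise ValueError("need at least 3 knots")
    scale = (k[-1] - k[0]) ** 2  # (k_n - k_1)^2 normalisation

    def pos3(u):
        # Truncated cube: u^3 when u > 0, otherwise 0
        return u ** 3 if u > 0 else 0.0

    rows = []
    for v in values:
        row = [float(v)]  # first spline variable is the variable itself
        for i in range(n - 2):
            tail = (pos3(v - k[-2]) * (k[-1] - k[i])
                    - pos3(v - k[-1]) * (k[-2] - k[i])) / (k[-1] - k[-2])
            row.append((pos3(v - k[i]) - tail) / scale)
        rows.append(row)
    return rows


# Knots from the question; the ages here are arbitrary illustration values
basis = rcs_basis([-13, -7, 0, 20], [-13, -7, 0, 8, 16])
```

With five knots this gives four columns per observation, which should correspond to age3sp1 through age3sp4. A quick sanity check is that every column is linear in age beyond the last knot, which is the defining restriction of this kind of spline.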
I am using national survey data (specifically, the UK LCF) for a regression analysis. The data contain a variable weighta, described as follows:
I really don't know which of the following Stata specifications for weighting is suitable to apply to my variable weighta. Can you please help me?
Here you can see what the values of my weighta variable look like; they are not integers, which makes me doubt that the fweight specification in Stata would fit, but I am still not sure about any of the others.
weightq=quarterly weights weighta=annual weights
I am trying to create a comparison group of observations using propensity score matching. There are some characteristics that I care more about matching on than others. My questions are:
Is it possible to adjust the relative weights of variables I'm matching on when constructing the propensity score?
If so, how would one do this in Stata (with the psmatch2 command, for example)?
Thanks!
Take a look at coarsened exact matching. There's a user-written command called cem that implements this in Stata and several other packages.
However, this is not equivalent to PSM, where the scores are estimated rather than imposed by the analyst.
I am using BUGS through R for Bayesian analysis, and I use the ggmcmc package for Bayesian inference.
In my recent example I am monitoring a whole matrix b of parameters, with dimensions 5x8. If I call a plot function from the ggmcmc package directly, there are so many parameters that I can't see a thing in the resulting posterior plot.
e.g. ggs_histogram
The plot functions in ggmcmc have a parameter called family, which you use to select a subset of parameters to include in the plot. The official package page says that you have to set family to a regular expression that matches the parameters you want. That is quite easy if you have, say, parameters a and b and you want to plot b (family='b').
Now, from the b matrix that I mentioned, I want to plot only the elements of one column, for example b[1,1],b[2,1],b[3,1],...,b[8,1]
So I tried to subset this the usual way, like family='b[,1]', and got:
Error in seq.default(mn, mx, by = bw) :
'from' cannot be NA, NaN or infinite
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
Any ideas? Maybe a correct regexp or a ggplot facet_grid trick?
Eventually, the official PDF documentation of the ggmcmc package had all the info I was looking for. I was right about needing a regular expression, and the package tutorial was pretty informative about the form the regular expression is expected to have.
So if I wanted, let's say, to make inferences about the elements of the first column of the parameter matrix,
family='b\\[.,1\\]'
would do the job just fine. This works with any of the inference functions of the ggmcmc package.
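To see why the first attempt matched nothing, it helps to test both patterns against the parameter names. The sketch below does that in Python, assuming the family argument is treated as an ordinary regular expression searched against names of the form b[i,j]. The list of names is generated here for illustration from the 5x8 dimensions mentioned above. Note that the doubled backslashes in the R string 'b\\[.,1\\]' become single backslashes in the actual pattern.

```python
import re

# Parameter names as they would appear for a 5x8 matrix b (illustrative)
names = [f"b[{i},{j}]" for i in range(1, 6) for j in range(1, 9)]

# Unescaped: [,1] is a character class, so this looks for "b," or "b1"
naive = [n for n in names if re.search(r"b[,1]", n)]

# Escaped brackets are literal; "." matches any single character (the row index)
escaped = [n for n in names if re.search(r"b\[.,1\]", n)]

print(naive)
print(escaped)
```

The unescaped pattern selects no parameters at all, which is consistent with the min/max warnings in the error message. Since "." matches exactly one character, the pattern only covers single-digit row indices; for larger matrices something like b\[\d+,1\] would probably be needed.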
I am using Stata 12 and I have to run an ordered probit (oprobit) on a panel dataset. I know that the oprobit command is meant for cross-sectional analysis. The newer version of Stata (Stata 13) has the xtoprobit command for random-effects ordered probit, and I need a similar command for Stata 12. I have tried the reoprob command, but when I use it with my panel dataset I get the following error:
"factor variables and time-series operators not allowed"
That means you need to create your own dummy variables instead of using the factor variable notation i.dummyvar. Try this:
tab dummyvar, gen(D)
reg y D*
This creates a set of dummy variables (D1, D2,...), one for each observed value of the tabulated variable.
Some of the older user-written commands do not know what to do with the factor variable notation, which is convenient, but fairly new.
You can also explore xi for more complicated tasks.
I have done this many times with Excel and Java... This time I need to do it using Stata because it is more convenient to preserve variables' labels. How can I restructure dataset_1 into dataset_2 below?
I need to transform the following dataset_1:
into dataset_2:
I know one way, which is a little awkward... I mean, I could expand all the observations, then create variable obsNo, and then, rename variables...is there any better way?
Stata is wonderful at this sort of thing; it's a simple reshape. Your data is a little awkward, as the reshape command was designed to work with variables where the common part of the variable name (in your case, Wage) comes first. In the documentation for reshape, "Wage" would be the stub. By default, the part following Wage is required to be numeric. If you first rename your variables with
rename (raceWhiteWage raceBlackWage raceAsianWage) (Wage1 Wage2 Wage3)
Then you can do:
reshape long Wage, i(state year) j(race)
That should give you the output you are looking for. You will have a column named "race", with values of 1 for White, 2 for Black, and 3 for Asian.