How to interpret the constant in an areg output? (Stata)

I am running regressions with fixed effects in Stata using areg, and I just realized that it reports a constant term. As I understand it, what areg does is apply a "within" transformation to the data and then run OLS on the transformed data. However, the constant term of the original model is wiped out by the "within" transformation.
So what does the constant reported by areg mean? Is it a programming mistake? I don't think so, because areg does not allow the -nocons- option, and the reason for that seems connected to the meaning of the constant.

With areg (as well as xtreg with the fe option), the intercept is fit so that y-bar minus x-bar times beta-hat is equal to zero. In other words, Stata calculates an intercept that makes the prediction calculated at the means of the independent variables equal to the mean of the dependent variable.
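In symbols, the reported constant is

$$\hat{\alpha} = \bar{y} - \bar{x}\hat{\beta},$$

so that the prediction at the means, $\bar{x}\hat{\beta} + \hat{\alpha}$, equals $\bar{y}$.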

As said in the Stata manual:
The intercept reported by areg deserves some explanation because, given k mutually exclusive and exhaustive dummies, it is arbitrary. areg identifies the model by choosing the intercept that makes the prediction calculated at the means of the independent variables equal to the mean of the dependent variable: ȳ = x̄b.
areg does give a different constant than regress does.
areg is also a bit tricky in that it is based on the assumption that the number of absorbed groups does not increase with the sample size.


Make continuous variable the baseline when interacting with factor variable (stata)

Consider two variables sex and age, where sex is treated as a categorical variable and age as a continuous variable. I want to specify the full factorial using the non-interacted age term as the baseline (the default for sex##c.age is to omit one of the sex categories).
The best I could come up with so far is to write out the factorial manually, omitting `age' from the regression:
reg y sex#c.age i.sex
This is mathematically equivalent to
reg y sex##c.age
but allows me to directly infer the regression coefficients (and standard errors!) on the interaction term sex*age for both genders.
Is there a way to stick to the more economical "##" notation but make the continuous variable the omitted category?
(I know the manual approach carries little notational overhead in the example given here, but the overhead becomes huge when working with triple-interaction terms.)
Not a complete answer, but a workaround: one can use `lincom' to obtain linear combinations of coefficients and their standard errors. E.g., one could obtain the combined age effects for both sexes as follows:
reg y sex##c.age
forval i = 1/2 {
    // combined age slope for sex == `i': base age effect plus interaction
    lincom age + `i'.sex#c.age
}

Pulp & coin-or-cbc: What is the meaning of SOS weights?

When defining a mixed integer linear programming problem using pulp, one may define an SOS like so:
from pulp import LpProblem, LpVariable, LpInteger

prob = LpProblem('example')
x1 = LpVariable('x1', cat=LpInteger)
x2 = LpVariable('x2', cat=LpInteger)
prob.sos1['sos'] = x1 + 2*x2
(an "sos", or specially ordered set, is a special constraint specifying that only a single variable in the set may be nonzero).
We see that this allows specifying weights for the SOS variables (1 and 2 in this case). Presumably they define the precedence of each variable, i.e. which variables the solver allows to be nonzero first when branching.
But how exactly are the weights defined?
The underlying solver is coin-or-cbc, and I couldn't find anything about how they use SOS weights.
Weights can be used for branching, although not all solvers use them that way. I believe CBC does, but you'll probably need to check the source code to confirm.
Weights in SOS2 are often needed to specify the ordering (SOS2 has the concept of neighbors). SOS1 does not have this issue.
Finally, if you have good bounds, binary variables are often better than SOS1 variables: solvers do better bounding and generate better cuts with binary variables. My rule is: if you can formulate an SOS1 structure with binary variables using good big-M values, use binary variables; if you cannot find good big-M values, consider SOS1. A sketch of the binary reformulation follows below.
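As a rough sketch of that rule in PuLP (the bound M = 100 and all variable names here are assumptions for illustration, not part of the question), the SOS1 condition "at most one of x1, x2 is nonzero" becomes:

from pulp import LpProblem, LpVariable, LpBinary

M = 100  # assumed valid upper bound on x1 and x2

prob = LpProblem('sos1_as_binaries')
x1 = LpVariable('x1', lowBound=0)
x2 = LpVariable('x2', lowBound=0)
b1 = LpVariable('b1', cat=LpBinary)
b2 = LpVariable('b2', cat=LpBinary)

prob += x1 <= M * b1   # x1 may be nonzero only if b1 == 1
prob += x2 <= M * b2   # x2 may be nonzero only if b2 == 1
prob += b1 + b2 <= 1   # at most one indicator may be switched on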

stata: inequality constraint in xttobit

Is it possible to constrain parameters in Stata's xttobit to be non-negative? I read a paper where the authors said they did just that, and I am trying to work out how.
I know that you can constrain parameters to be strictly positive by exponentially transforming the variables (e.g. gen x1_e = exp(x1)) and then calling nlcom after estimation (e.g. nlcom exp(_b[x1:_y]), where y is the independent variable). That may not be exactly right, but I am pretty sure the general idea is correct. (Here is a similar question from Statalist re: nlsur.)
But what would a non-negative constraint look like? I know that one way to proceed is by transforming the variables, for example squaring them. However, I tried this with the authors' data and still got negative estimates from xttobit. Sorry if this is a trivial question, but it has me a little confused.
(Note: this was first posted on CV by mistake. Mea culpa.)
Update: It seems I misunderstood what the transformation means. Suppose we want to estimate the following random-effects model:
y_{it} = a + b*x_{it} + v_i + e_{it}
where v_i is the individual random effect for i and e_{it} is the idiosyncratic error.
From the first answer, would, say, an exponential transformation to constrain all coefficients to be positive look like:
y_{it} = exp(a) + exp(b)*x_{it} + v_i + e_{it}
?
I think your understanding of constraining parameters by transforming the associated variable is incorrect. You don't transform the variable; rather, you fit your model after reexpressing it in terms of transformed parameters. For more details, see the FAQ at http://www.stata.com/support/faqs/statistics/regression-with-interval-constraints/, and be prepared to work harder on your problem than you might have expected to: you will need to replace xttobit with mlexp and a transformed parameterization of the tobit log-likelihood function.
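Concretely, a sketch of that reparameterization for the model in the update above (not exact mlexp syntax): estimate an unconstrained parameter $\theta$ in place of $b$,

$$y_{it} = a + e^{\theta} x_{it} + v_i + e_{it},$$

so that the slope $b = e^{\theta}$ is positive by construction, while the variable $x_{it}$ itself stays untransformed.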
With regard to the difference between non-negative and strictly positive constraints: for continuous parameters, all such constraints are effectively non-negative, because (for a reasonable parameterization) a strictly positive parameter can be made arbitrarily close to zero.

Formula for PI-regulation Proportional Integral algorithm

I've been reading this website: http://www.csimn.com/CSI_pages/PIDforDummies.html and I'm confused about the proportional integral part. Here's what it says.
Proportional control
Here’s a diagram of the controller when we have enabled only P control:
In Proportional Only mode, the controller simply multiplies the Error by the Proportional Gain (Kp) to get the controller output.
The Proportional Gain is the setting that we tune to get our desired performance from a “P only” controller.
A match made in heaven: The P + I Controller
If we put Proportional and Integral Action together, we get the humble PI controller. The Diagram below shows how the algorithm in a PI controller is calculated.
The tricky thing about Integral Action is that it will really screw up your process unless you know exactly how much Integral action to apply.
A good PID Tuning technique will calculate exactly how much Integral to apply for your specific process - but how is the Integral Action adjusted in the first place?
As you can see, the proportional part is easy to understand: it says that you multiply the error by a tuning variable. The part that I don't get is where the P and I come from in the second part, and what mathematical operation is done with them. I don't have a degree in mathematics or advanced calculus knowledge, so I would appreciate it if you could keep it at algebra level.
There is a big part missing from the text: the actual physical system that turns the controller output into the process response, i.e. the actual physical variable being controlled.
Think of the integral as a kind of averaging operation that filters out small oscillations in the PV input. It also represents a kind of memory of the immediate past of the process.
An exponential moving average, for instance, can be thought of as a mix of integral and proportional action.
Staying with the car-driving example: if you come to a curve where you need the steering wheel in a certain position to follow a circle, you don't just yank the wheel to that position; you move it there gradually (most of the time). Exactly such ramp-up and ramp-down actions are the effect of the integral action part.
The integral part is just a summation, also multiplied by some constant.
Analogue integration is done in hardware with an integrating amplifier circuit.
Digital integration of first order is just:
output += input*dt;
Second order is:
temp += input*dt;
output += temp*dt;
Here dt is the duration of one iteration of the control loop (from a timer or whatever drives it).
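In other words, the running sum in that code is the algebra-level version of an integral:

$$i_n = \sum_{k=1}^{n} e_k\,\Delta t \approx \int_0^{t_n} e(\tau)\,d\tau,$$

where $e_k$ is the input (error) sampled in loop iteration $k$ and $\Delta t$ is the loop period (dt in the code).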
Do not forget that a PI regulator can have a more complicated response:
i1 += input*dt;
i2 += i1*dt;
i3 += i2*dt;
output = a0*input + a1*i1 + a2*i2 + a3*i3 ...;
where a0 is the P part.
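Here is a minimal, self-contained sketch of such a loop in Python (the gains are arbitrary, and the first-order plant model in the loop is an assumption added only so the code has something to regulate):

kp, ki = 2.0, 0.5      # proportional and integral gains (arbitrary tuning)
dt = 0.01              # loop period in seconds
setpoint = 1.0         # preset value the regulator should reach
pv = 0.0               # process variable (the measured, controlled value)
integral = 0.0         # running sum implementing the I part

for step in range(5000):
    error = setpoint - pv
    integral += error * dt                  # first-order digital integration
    output = kp * error + ki * integral     # PI control law
    pv += (output - pv) * dt                # assumed first-order plant response

print(round(pv, 3))    # ends near 1.0: the I term removes the steady-state error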
Now, the I regulator keeps adding more and more to the control value until the controlled value matches the preset value; the longer they fail to match, the harder it pushes. Compared to a P regulator with the same gain, this creates fast oscillations around the preset value, but on average the control time is smaller. Therefore the I gain is usually much, much smaller, which creates the memory and smoothing effect LutzL mentioned (while the regulation time stays similar to, or smaller than, pure P regulation).
The controlled device has its own response, which can be represented as a differential equation. There is a lot of theory in cybernetics about obtaining the right regulator response to match your process needs, such as:
quality of control
reaction time
maximum oscillation amplitude
stability
For all of that you need differential math, like solving systems of differential equations of any order. I strongly recommend using the Laplace transform, though many people use the Z transform (its discrete-time counterpart) instead.
So the I regulator adds speed to the regulation, but it also creates bigger oscillations, and when it does not match the regulated system properly it also creates instability. Integration adds overflow risk to the regulation (analogue integration is very sensitive to it). Also keep in mind that you can subtract the I part from the control value instead, which does the exact opposite; sometimes a combination of several I parts is used to match the desired regulation response shape.

Modelica Evaluation Order

I can't really find an answer in the Modelica specification, so I'll ask here. The specification states that:
A tool is free to solve equations, reorder expressions and to not evaluate expressions if their values do not influence the result (e.g. short-circuit evaluation of Boolean expressions). If-statements and if-expressions guarantee that their clauses are only evaluated if the appropriate condition is true, but relational operators generating state or time events will during continuous integration have the value from the most recent event.
If a numeric operation overflows the result is undefined. For literals it is recommended to automatically convert the number to another type with greater precision.
Now I wonder: can the tool choose to evaluate an expression several times in one integrator step? For example (probably not a valid example, just to give you an idea of what I am wondering about):
Real x;
equation
  der(x) = -time;
  Modelica.Utilities.Streams.print(String(time));
This prints the same time value several times, so I figured that there is some kind of iteration going on. But I would really like to have that confirmed by some source.
That is normal. Variable-step-size solvers (like DASSL) can go back and forth in time to find the direction of the curve. Also, if you have events, several values can be generated at the same time instant.
If you want to print time or values only at exact time instants, you need a when equation:
when sample(0, 1) then
  Modelica.Utilities.Streams.print(String(time));
end when;
Read more about sample in the Modelica specification.
It is also possible to use a fixed-step-size solver such as Euler.