bioprobit: the covariance matrix of the betas (Stata)

I am comparing the weights of credit rating determinants across Moody's and S&P.
The goal of the bioprobit analysis is then to test whether the beta coefficients are the same for Moody's and S&P.
I want to do this with a Wald test, but for that I need the covariance matrix of the betas. Could you please help me with the Stata code to obtain the covariance matrix?
The variables entering the model are S&Prat, Mrat, GDP, Inflation, Ratio, etc.
Thanks in advance.

Based on @Nick Cox's comment:
Example using Stata's auto dataset (you need to install bioprobit, which is a user-written command; findit bioprobit will locate it).
sysuse auto
bioprobit headroom foreign price length mpg turn
group(foreign) |      Freq.     Percent        Cum.
---------------+-----------------------------------
             1 |         52       70.27       70.27
             2 |         22       29.73      100.00
---------------+-----------------------------------
         Total |         74      100.00
initial: log likelihood = -148.5818
rescale: log likelihood = -148.5818
rescale eq: log likelihood = -147.44136
Iteration 0: log likelihood = -147.44136
Iteration 1: log likelihood = -147.43958
Iteration 2: log likelihood = -147.43958
Bivariate ordered probit regression Number of obs = 74
Wald chi2(4) = 22.61
Log likelihood = -147.43958 Prob > chi2 = 0.0002
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
headroom |
price | -.0000664 .0000478 -1.39 0.164 -.00016 .0000272
length | .0347597 .013096 2.65 0.008 .009092 .0604274
mpg | -.0118916 .0354387 -0.34 0.737 -.0813502 .0575669
turn | -.0333833 .0554614 -0.60 0.547 -.1420857 .0753191
-------------+----------------------------------------------------------------
foreign |
price | .0003981 .0001485 2.68 0.007 .0001071 .0006892
length | -.0585548 .0284639 -2.06 0.040 -.114343 -.0027666
mpg | -.0306867 .0543826 -0.56 0.573 -.1372745 .0759012
turn | -.3471526 .1321667 -2.63 0.009 -.6061946 -.0881106
-------------+----------------------------------------------------------------
athrho |
_cons | .053797 .3131717 0.17 0.864 -.5600082 .6676022
-------------+----------------------------------------------------------------
/cut11 | 2.72507 2.451108 -2.079014 7.529154
/cut12 | 3.640296 2.445186 -1.152181 8.432772
/cut13 | 4.227321 2.443236 -.561334 9.015975
/cut14 | 4.792874 2.452694 -.0143182 9.600067
/cut15 | 5.586825 2.480339 .7254488 10.4482
/cut16 | 6.381491 2.505192 1.471404 11.29158
/cut17 | 7.145783 2.529663 2.187735 12.10383
/cut21 | -21.05768 6.50279 -33.80292 -8.312449
-------------+----------------------------------------------------------------
rho | .0537452 .3122671 -.5079835 .5834004
------------------------------------------------------------------------------
LR test of indep. eqns. : chi2(1) = 0.03 Prob > chi2 = 0.8636
* Results saved in Stata's memory
ereturn list
scalars:
e(rc) = 0
e(ll) = -147.4395814769408
e(converged) = 1
e(rank) = 17
e(k) = 17
e(k_eq) = 11
e(k_dv) = 2
e(ic) = 2
e(N) = 74
e(k_eq_model) = 1
e(df_m) = 4
e(chi2) = 22.60944901065799
e(p) = .0001515278365065
e(ll_0) = -147.4543291018424
e(k_aux) = 8
e(chi2_c) = .0294952498030625
e(p_c) = .8636405133599019
macros:
e(chi2_ct) : "LR"
e(depvar) : "headroom foreign"
e(predict) : "bioprobit_p"
e(cmd) : "bioprobit"
e(chi2type) : "Wald"
e(vce) : "oim"
e(opt) : "ml"
e(title) : "Bivariate ordered probit regression"
e(ml_method) : "d2"
e(user) : "bioprobit_d2"
e(crittype) : "log likelihood"
e(technique) : "nr"
e(properties) : "b V"
matrices:
e(b) : 1 x 17
e(V) : 17 x 17
e(gradient) : 1 x 17
e(ilog) : 1 x 20
functions:
e(sample)
* Use matrix list e(V) to display the variance-covariance matrix
mat list e(V)
symmetric e(V)[17,17]
headroom: headroom: headroom: headroom: foreign: foreign: foreign: foreign:
price length mpg turn price length mpg turn
headroom:price 2.280e-09
headroom:length -1.431e-07 .00017151
headroom:mpg 3.991e-07 .00018914 .0012559
headroom:turn 4.426e-07 -.00050302 .00027186 .00307597
foreign:price 1.124e-10 -4.999e-09 2.093e-08 2.079e-08 2.205e-08
foreign:length -5.846e-09 8.021e-06 9.950e-06 -.0000249 -2.087e-06 .00081019
foreign:mpg 1.712e-08 .00001035 .00006387 .00001352 1.254e-06 .0006546 .00295746
foreign:turn 1.145e-08 -.00002418 .00001022 .00015562 -.00001083 -.00028103 -.0001411 .01746805
athrho:_cons 2.360e-07 -.00004531 .0000684 .00005575 -2.010e-06 .00043717 -.00147713 -.00449239
cut11:_cons .0000134 .01507955 .07578798 .03653671 1.039e-06 .00068972 .00401168 .00211706
cut12:_cons .00001374 .01514192 .07570527 .03630636 9.488e-07 .0007133 .00386727 .00165474
cut13:_cons .00001393 .01520261 .07550433 .03603257 9.668e-07 .0007088 .00386171 .00165557
cut14:_cons .00001363 .01539981 .07532214 .03582323 1.042e-06 .00068687 .00392914 .00189195
cut15:_cons .00001264 .01584186 .07541396 .03541453 1.101e-06 .00068091 .0040106 .00209853
cut16:_cons .00001148 .01611862 .07562328 .03535426 1.052e-06 .00069849 .00401805 .00206701
cut17:_cons .00001055 .01602514 .07547739 .03620485 9.866e-07 .00069868 .00399718 .00207143
cut21:_cons 4.412e-07 .00073781 .00377201 .00190456 -.00058242 .13231539 .18778679 .51179829
athrho: cut11: cut12: cut13: cut14: cut15: cut16: cut17:
_cons _cons _cons _cons _cons _cons _cons _cons
athrho:_cons .09807649
cut11:_cons -.0064343 6.0079319
cut12:_cons .00229188 5.9652808 5.9789347
cut13:_cons .00187855 5.9546524 5.9639617 5.9694026
cut14:_cons -.00310632 5.9724552 5.9793328 5.9820512 6.0157096
cut15:_cons -.00783593 6.0300908 6.03522 6.0360956 6.0667389 6.1520838
cut16:_cons -.00756313 6.0745198 6.0789515 6.0788816 6.1081885 6.1880183 6.275988
cut17:_cons -.00673882 6.0811477 6.0851101 6.0844209 6.1128719 6.1897756 6.2679698 6.3991936
cut21:_cons -.13478036 .30582954 .28918756 .28844026 .29527602 .30401845 .30575462 .30503648
cut21:
_cons
cut21:_cons 42.286275
* To extract the variance-covariance matrix of the first four coefficients
mat kk=e(V)
mat kkk=kk[1..4,1..4]
mat list kkk
symmetric kkk[4,4]
headroom: headroom: headroom: headroom:
price length mpg turn
headroom:price 2.280e-09
headroom:length -1.431e-07 .00017151
headroom:mpg 3.991e-07 .00018914 .0012559
headroom:turn 4.426e-07 -.00050302 .00027186 .00307597
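Returning to the original question: once bioprobit has been fit, Stata's test command carries out the Wald test itself, using e(V) behind the scenes, so you usually do not need to manipulate the matrix by hand. A minimal sketch with the auto example above (for the ratings application, substitute your own equation and variable names, e.g. [Mrat] and [SPrat]; those names are assumptions, since S&Prat is not a legal Stata variable name):
* Wald test that each coefficient is equal across the two equations
test [headroom]price = [foreign]price
test [headroom]length = [foreign]length, accumulate
test [headroom]mpg = [foreign]mpg, accumulate
test [headroom]turn = [foreign]turn, accumulate
The accumulate option builds the four equality restrictions into a single joint test.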


Setting base category for margins in Stata

I am calculating marginal effects from a logistic regression in Stata. I need the base of quantilev2 to be set to 2, but the default is 3. I have the following code and output.
fvset base 3 quantilev2
logit returnv customer_age i.customer_gender return_ratio_customer total_past_orders avg_ordersize relationship_length total_rela_value price i.pricesegment2 discount_value discount_ratio return_ratio_product i.category2 brand2 i.quantilev2 basket_size diff_prod_cat same_item same_item_diff_size product_number_within_order i.quantilev2#c.basket_size
margins, dydx(quantilev2) at(basket_size=0.205471) at(basket_size=7.3485774) at(basket_size=14.449777) pwcompare
The margins command then gives the following output
Expression: Pr(returnv), predict()
dy/dx wrt: 1.quantilev2 2.quantilev2
1._at: basket_size = .205471
2._at: basket_size = 7.348577
3._at: basket_size = 14.44978
---------------------------------------------------------------
| Contrast Delta-method Unadjusted
| dy/dx std. err. [95% conf. interval]
--------------+------------------------------------------------
1.quantilev2 |
_at |
2 vs 1 | -.0064799 .002717 -.0118052 -.0011546
3 vs 1 | -.0128849 .0053998 -.0234683 -.0023015
3 vs 2 | -.0064049 .0026827 -.011663 -.0011469
--------------+------------------------------------------------
2.quantilev2 |
_at |
2 vs 1 | .0084133 .0035772 .0014022 .0154244
3 vs 1 | .0167195 .0070962 .0028111 .0306279
3 vs 2 | .0083062 .0035191 .0014089 .0152035
--------------+------------------------------------------------
3.quantilev2 | (base outcome)
---------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the
base level.
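If the goal is to have level 2 of quantilev2 as the base outcome, one option, sketched below with the variable names from the question (the full covariate list is abbreviated here), is to set the base before estimation, either with fvset or with the ib2. operator:
* make level 2 the base (reference) level of quantilev2
fvset base 2 quantilev2
* or specify it inline when fitting the model (other covariates omitted)
logit returnv ib2.quantilev2 basket_size ib2.quantilev2#c.basket_size
margins, dydx(quantilev2) at(basket_size=(0.205471 7.3485774 14.449777)) pwcompare
With either approach, margins should then report 2.quantilev2 as the (base outcome).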

Select right kernel size for median blur to reduce noise

I am new to image processing. We have a requirement to get circle centers with sub-pixel accuracy from an image. I have used median blurring to reduce the noise. A portion of the image is shown below. The steps I followed to get the circle boundaries are:
Reduced the noise with medianBlur
Applied Otsu thresholding with the threshold API
Identified circle boundaries with the findContours method
I get different results when I use different kernel sizes for medianBlur. I chose medianBlur to preserve edges. I tried kernel sizes 3, 5, and 7, and now I am unsure which one to use.
How can I decide on the right kernel size?
Is there a scientific approach to choosing the kernel size for medianBlur?
I will give you two suggestions here for how to find the centroids of these disks; you can pick one depending on the level of precision you need.
First of all, using contours is not the best method. Contours depend a lot on which pixels happen to fall within the object after thresholding, and noise affects these a lot.
A better method is to find the center of mass (or rather, the first order moments) of the disks. Read Wikipedia to learn more about moments in image analysis. One nice thing about moments is that we can use pixel values as weights, increasing precision.
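For reference, the intensity-weighted centroid (the first-order moments normalized by the zeroth moment) is, writing $I(x,y)$ for the pixel value used as a weight (notation introduced here, not taken from the post):
$$\bar{x} = \frac{\sum_{x,y} x\, I(x,y)}{\sum_{x,y} I(x,y)}, \qquad \bar{y} = \frac{\sum_{x,y} y\, I(x,y)}{\sum_{x,y} I(x,y)}$$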
You can compute the moments of a binary shape from its contours, but you cannot use image intensities in this case. OpenCV has a function cv::moments that computes the moments for the whole image, but I don't know of a function that can do this for each object separately. So instead I'll be using DIPlib for these computations (I am one of its authors).
Regarding the filtering:
Any well-behaved linear smoothing should not affect the center of mass of the objects, as long as the objects are far enough from the image edge. Being close to the edge will cause the blur to do something different on the side of the object closest to the edge compared to the other sides, introducing a bias.
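One way to see this, ignoring boundary effects and treating each object in isolation: for non-negative images the centroid of a convolution is the sum of the two centroids,
$$\bar{x}_{I \ast h} = \bar{x}_I + \bar{x}_h,$$
so a symmetric kernel $h$ such as a Gaussian, whose own centroid is zero, leaves the object's centroid where it was.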
Any non-linear smoothing filter has the ability to change the center of mass. Please avoid the median filter.
So, I recommend that you use a Gaussian filter, which is the most well-behaved linear smoothing filter.
Method 1: use binary shape's moments:
First I'm going to threshold without any form of blurring.
import diplib as dip
a = dip.ImageRead('/Users/cris/Downloads/Ef8ey.png')
a = a(1) # Use green channel only, simple way to convert to gray scale
_, t = dip.Threshold(a)
b = a<t
m = dip.Label(b)
msr = dip.MeasurementTool.Measure(m, None, ['Center'])
print(msr)
This outputs
| Center |
- | ----------------------- |
| dim0 | dim1 |
| (px) | (px) |
- | ---------- | ---------- |
1 | 18.68 | 9.234 |
2 | 68.00 | 14.26 |
3 | 19.49 | 48.22 |
4 | 59.68 | 52.42 |
We can now apply a smoothing to the input image a and compute again:
a = dip.Gauss(a,2)
_, t = dip.Threshold(a)
b = a<t
m = dip.Label(b)
msr = dip.MeasurementTool.Measure(m, None, ['Center'])
print(msr)
| Center |
- | ----------------------- |
| dim0 | dim1 |
| (px) | (px) |
- | ---------- | ---------- |
1 | 18.82 | 9.177 |
2 | 67.74 | 14.27 |
3 | 19.51 | 47.95 |
4 | 59.89 | 52.39 |
You can see there's some small change in the centroids.
Method 2: use gray scale moments:
Here we use the error function to apply a pseudo-threshold to the image. What this does is set object pixels to 1 and background pixels to 0, but pixels around the edges retain some intermediate value. Some people refer to this as "fuzzy thresholding". The two images (not shown here) illustrate the normal ("hard") threshold and the error function clip ("fuzzy threshold").
By using this fuzzy threshold, we retain more information about the exact (sub-pixel) location of the edges, which we can use when computing the first order moments.
import diplib as dip
a = dip.ImageRead('/Users/cris/Downloads/Ef8ey.png')
a = a(1) # Use green channel only, simple way to convert to gray scale
_, t = dip.Threshold(a)
c = dip.ContrastStretch(-dip.ErfClip(a, t, 30))
m = dip.Label(a<t)
m = dip.GrowRegions(m, None, -2, 2)
msr = dip.MeasurementTool.Measure(m, c, ['Gravity'])
print(msr)
This outputs
| Gravity |
- | ----------------------- |
| dim0 | dim1 |
| (px) | (px) |
- | ---------- | ---------- |
1 | 18.75 | 9.138 |
2 | 67.89 | 14.22 |
3 | 19.50 | 48.02 |
4 | 59.79 | 52.38 |
We can now apply a smoothing to the input image a and compute again:
a = dip.Gauss(a,2)
_, t = dip.Threshold(a)
c = dip.ContrastStretch(-dip.ErfClip(a, t, 30))
m = dip.Label(a<t)
m = dip.GrowRegions(m, None, -2, 2)
msr = dip.MeasurementTool.Measure(m, c, ['Gravity'])
print(msr)
| Gravity |
- | ----------------------- |
| dim0 | dim1 |
| (px) | (px) |
- | ---------- | ---------- |
1 | 18.76 | 9.094 |
2 | 67.87 | 14.19 |
3 | 19.50 | 48.00 |
4 | 59.81 | 52.39 |
You can see the differences are smaller this time, because the measurement is more precise.
In the binary case, the differences in centroids with and without smoothing are:
array([[ 0.14768417, -0.05677508],
[-0.256 , 0.01668085],
[ 0.02071882, -0.27547569],
[ 0.2137167 , -0.03472741]])
In the gray-scale case, the differences are:
array([[ 0.01277204, -0.04444567],
[-0.02842993, -0.0276569 ],
[-0.00023144, -0.01711335],
[ 0.01776011, 0.01123299]])
If the centroid measurement is given in µm rather than px, it is because your image file contains pixel size information. The measurement function will use this to give you real-world measurements (the centroid coordinate is w.r.t. the top-left pixel). If you do not desire this, you can reset the image's pixel size:
a.SetPixelSize(1)
The two methods in C++
This is a translation to C++ of the code above, including a display step to double-check that the thresholding produced the right result:
#include <iostream>
#include "diplib.h"
#include "dipviewer.h"
#include "diplib/simple_file_io.h"
#include "diplib/linear.h"       // for dip::Gauss()
#include "diplib/segmentation.h" // for dip::Threshold()
#include "diplib/regions.h"      // for dip::Label()
#include "diplib/measurement.h"
#include "diplib/mapping.h"      // for dip::ContrastStretch() and dip::ErfClip()

int main() {
   auto a = dip::ImageRead("/Users/cris/Downloads/Ef8ey.png");
   a = a[1]; // Use green channel only, simple way to convert to gray scale
   dip::Gauss(a, a, {2});
   dip::Image b;
   double t = dip::Threshold(a, b);
   b = a < t; // Or: dip::Invert(b,b);
   dip::viewer::Show(a);
   dip::viewer::Show(b); // Verify that the segmentation is correct
   dip::viewer::Spin();
   auto m = dip::Label(b);
   dip::MeasurementTool measurementTool;
   auto msr = measurementTool.Measure(m, {}, {"Center"});
   std::cout << msr << '\n';
   auto c = dip::ContrastStretch(-dip::ErfClip(a, t, 30));
   dip::GrowRegions(m, {}, m, -2, 2);
   msr = measurementTool.Measure(m, c, {"Gravity"});
   std::cout << msr << '\n';
   // Iterate through the measurement structure:
   auto it = msr["Gravity"].FirstObject();
   do {
      std::cout << "Centroid coordinates = " << it[0] << ", " << it[1] << '\n';
   } while(++it);
}

Get a second header with the units of columns

Sometimes in academic texts one wants to present a table in which every column has units. Usually the units are specified below the column names, like this:
|Object |Volume | area | Price |
| |$cm^3$ |$cm^2$ | euros |
|:------------|:-------|--------:|---------:|
|A |3 | 43.36| 567.40|
|B |15 | 43.47| 1000.80|
|C |1 | 42.18| 8.81|
|D |7 | 37.92| 4.72|
How could I achieve this for my bookdown documents?
Thank you in advance.
Here is a way using kableExtra:
```{r}
library(kableExtra)

df <- data.frame(Object = LETTERS[1:5],
                 Volume = round(runif(5, 1, 20)),
                 area   = rnorm(5, 40, 3),
                 Price  = rnorm(5, 700, 200))

colNames <- names(df)
dfUnits  <- c("", "$cm^3$", "$cm^2$", "€")

kable(df, col.names = dfUnits, escape = F, align = "c") %>%
  add_header_above(header = colNames, line = F, align = "c")
```

Exporting Stata's coefficient vector: Meaning of suffixes in interaction column

I ran a regression in Stata:
reg y i.ind1990#i.year, nocons r
Then I exported the coefficient vector from Stata using
matrix x = e(b)
esttab matrix(x) using "xx.csv", replace plain
and loaded it in Python and pandas using
df = pd.read_csv('xx.csv', skiprows=1, index_col=[0]).T.dropna()
df.index.name = 'interaction'
df = df.reset_index()
ind1990 and year are numeric. But I have some odd values in my csv (year and ind are manually pulled out of interaction):
interaction y1 ind year
0 0b.ind1990#2001b.year 0.000000 0b 2001b
1 0b.ind1990#2002.year 0.320578 0b 2002
2 0b.ind1990#2003.year 0.304471 0b 2003
3 0b.ind1990#2004.year 0.271429 0b 2004
4 0b.ind1990#2005.year 0.295347 0b 2005
I believe that 0b is how Stata translates missing values aka NIU. But I can't make sense of the other non-numeric values.
Here's what I get for years (there are both b and o as unexpected suffixes):
array(['2001b', '2002', '2003', '2004', '2005', '2006', '2007', '2008',
'2009', '2010', '2011', '2012', '2013', '2014', '2015', '2004o',
'2008o', '2012o', '2003o', '2005o', '2006o', '2007o', '2009o',
'2010o', '2011o', '2013o', '2014o', '2015o', '2002o'], dtype=object)
and for ind1990 (where 0b is apparently NIU, but there are also o suffixes that I can't make sense of):
array(['0b', '10', '11', '12', '20', '31', '32', '40', '41', '42', '50',
'60', '100', '101', '102', '110', '111', '112', '120', '121', '122',
'122o', '130', '130o', '132', '140', '141', '142', '150', '151',
'152', '152o', '160', '161', '162', '171', '172', '180', '181',
'182', '190', '191', '192', '200', '201', '201o', '210', '211',
'220', '220o', '221', '221o', '222', '222o', '230', '231', '232',
'241', '242', '250', '251', '252', '261', '262', '270', '271',
'272o', '272'], dtype=object)
What do the b and o suffixes mean at the end of values of the interaction column?
This isn't an answer, but it won't go well as a comment and it may clarify the question.
The example here isn't reproducible without @FooBar's data. Here is another one that (a) Stata users can reproduce and (b) Python users can, I think, import:
. sysuse auto, clear
(1978 Automobile Data)
. regress mpg i.foreign#i.rep78, nocons r
note: 1.foreign#1b.rep78 identifies no observations in the sample
note: 1.foreign#2.rep78 identifies no observations in the sample
Linear regression Number of obs = 69
F(7, 62) = 364.28
Prob > F = 0.0000
R-squared = 0.9291
Root MSE = 6.1992
-------------------------------------------------------------------------------
| Robust
mpg | Coef. Std. Err. t P>|t| [95% Conf. Interval]
--------------+----------------------------------------------------------------
foreign#rep78 |
Domestic#2 | 19.125 1.311239 14.59 0.000 16.50387 21.74613
Domestic#3 | 19 .8139726 23.34 0.000 17.37289 20.62711
Domestic#4 | 18.44444 1.520295 12.13 0.000 15.40542 21.48347
Domestic#5 | 32 1.491914 21.45 0.000 29.01771 34.98229
Foreign#1 | 0 (empty)
Foreign#2 | 0 (empty)
Foreign#3 | 23.33333 1.251522 18.64 0.000 20.83158 25.83509
Foreign#4 | 24.88889 .8995035 27.67 0.000 23.09081 26.68697
Foreign#5 | 26.33333 3.105666 8.48 0.000 20.1252 32.54147
-------------------------------------------------------------------------------
. matrix b = e(b)
. esttab matrix(b) using b.csv, plain
(output written to b.csv)
The file b.csv looks like this:
"","b","","","","","","","","",""
"","0b.foreign#1b.rep78","0b.foreign#2.rep78","0b.foreign#3.rep78","0b.foreign#4.rep78","0b.foreign#5.rep78","1o.foreign#1b.rep78","1o.foreign#2o.rep78","1.foreign#3.rep78","1.foreign#4.rep78","1.foreign#5.rep78"
"y1","0","19.125","19","18.44444","32","0","0","23.33333","24.88889","26.33333"
Stata's notation here is accessible to non-Stata users: see the factor-variables documentation (help fvvarlist in Stata). In that notation a b suffix marks a base level and an o suffix marks an omitted level, such as the empty interaction cells noted above.
I don't use esttab (a user-written Stata program) or Python (that's ignorance, not prejudice), so I can't comment beyond that.

test with missing standard errors

How can I conduct a hypothesis test in Stata when my predictor perfectly predicts my dependent variable?
I would like to run the same regression over many subsets of my data. For each regression, I would then like to test the hypothesis that beta_1 = 1/2. However, for some subsets, I have perfect collinearity, and Stata is not able to calculate standard errors.
For example, in the below case,
sysuse auto, clear
gen value = 2*foreign*(price<6165)
gen value2 = 2*foreign*(price>6165)
gen id = 1 + (price<6165)
I get the output
. reg foreign value value2 weight length, noconstant
Source | SS df MS Number of obs = 74
-------------+------------------------------ F( 4, 70) = .
Model | 22 4 5.5 Prob > F = .
Residual | 0 70 0 R-squared = 1.0000
-------------+------------------------------ Adj R-squared = 1.0000
Total | 22 74 .297297297 Root MSE = 0
------------------------------------------------------------------------------
foreign | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
value | .5 . . . . .
value2 | .5 . . . . .
weight | 3.54e-19 . . . . .
length | -6.31e-18 . . . . .
------------------------------------------------------------------------------
and
. test value = .5
( 1) value = .5
F( 1, 70) = .
Prob > F = .
In the actual data, there is usually more variation. So I can identify the cases where the predictor does a very good job of predicting the DV--but I miss those cases where prediction is perfect. Is there a way to conduct a hypothesis test that catches these cases?
EDIT:
The end goal would be to classify observations within subsets based on the hypothesis test. If I cannot reject the hypothesis at the 95% confidence level, I classify the observation as type 1. Below, both groups would be classified as type 1, though I only want the second group.
gen type = .
forvalues i = 1/2 {
    quietly: reg foreign value value2 weight length if id == `i', noconstant
    test value = .5
    replace type = 1 if id == `i' & r(p) > .05
}
There is no way to do this out of the box that I'm aware of. Of course you could program it yourself to get an approximation of the p-value in these cases. The standard error is missing here because the relationship between x and y is perfectly collinear. There is no noise in the model, nothing deviates.
Interestingly enough, though, the standard error of the estimate is useless in this case anyway: test performs a Wald test of beta_i = exp against beta_i != exp, not a t-test.
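Concretely, the statistic being computed (this mirrors the matrix code further down) is
$$W = (R\hat\beta - r)'\,(R \hat V R')^{-1}\,(R\hat\beta - r),$$
which Stata converts to an F statistic by dividing by the number of restrictions; when $\hat V$ is exactly zero the inverse does not exist and the test comes back missing.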
The Wald test uses the variance-covariance matrix from the regression. To see this yourself, refer to the Methods and formulas section of [R] test in the Stata manual and run the following code (if you remove the -1 from the gen mpg2 = line below and rerun, you will see the issue):
sysuse auto, clear
gen mpg2 = mpg * 2.5 - 1
qui reg mpg2 mpg, nocons
* collect matrices to calculate Wald statistic
mat b = e(b) // Vector of Coefficients
mat V = e(V) // Var-Cov matrix
mat R = (1)   // for use in Rb-r; this is not (0,1) because of
              // the use of the noconstant option in regress
mat r = (2.5) // Value you want to test for equality
mat W = (R*b-r)'*inv(R*V*R')*(R*b-r)
// This is where it breaks for you, because with perfect collinearity, V == 0
reg mpg2 mpg, nocons
test mpg = 2.5
sca F = r(F)
sca list F
mat list W
Now, as @Brendan Cox suggested, you might be able to simply use the missing value returned in r(p) to condition your replace command, depending on exactly how you are using it. A word of caution, however: when the relationship between some x and y is such that y = 2x, and you test x = 5 versus x = 2, you will want to be very careful about the interpretation of missing p-values. In both cases the observations are classified as type == 1, where the test x = 2 command should not result in that outcome.
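A sketch of that conditioning inside the loop from the question (variable and group names as in the question; whether to keep or separate the missing-p case is up to you):
quietly reg foreign value value2 weight length if id == `i', noconstant
quietly test value = .5
* a missing r(p) satisfies r(p) > .05 because missing sorts above any number,
* which is exactly why the original loop flags both groups; separate the cases:
replace type = 1 if id == `i' & r(p) > .05 & !missing(r(p))
replace type = 1 if id == `i' & missing(r(p))   // keep or drop, depending on intent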
Another work-around would be to simply set p = 0 in these cases, since the variance estimate will asymptotically approach 0 as the linear relationship becomes near perfect, and thus the Wald statistic will approach infinity (driving p down, all else equal).
A final yet more complicated work-around in this case could be to calculate the F-statistic manually using the formula in the manual, and setting V to some arbitrary, yet infinitesimally small number. I've included code to do this below, but it is quite a bit more involved than simply issuing the test command, and in truth only an approximation of the actual p-value from the F distribution.
clear *
sysuse auto
gen i = ceil(_n/5)
qui sum i
gen mpg2 = mpg * 2 if i <= 5        // Get different estimation results
replace mpg2 = mpg * 10 if i > 5    // over different subsets of data
gen type = .
local N = _N          // use for d.f. calculation later
local iMax = r(max)   // use to iterate loop
forvalues i = 1/`iMax' {
    qui reg mpg2 mpg if i == `i', nocons
    mat b`i' = e(b)   // collect returned results for Wald stat
    mat V`i' = e(V)
    sca cov`i' = V`i'[1,1]
    mat R`i' = (1)
    mat r`i' = (2)    // Value you wish to test against
    if (cov`i' == 0) {    // set V to be very small if Variance = 0 & calculate Wald
        mat V`i' = 1.0e-14
    }
    mat W`i' = (R`i'*b`i'-r`i')'*inv(R`i'*V`i'*R`i'')*(R`i'*b`i'-r`i')
    sca W`i' = W`i'[1,1]              // collect Wald statistic into scalar
    sca p`i' = Ftail(1,`N'-2, W`i')   // pull p-value from F dist
    if p`i' > .05 {
        replace type = 1 if i == `i'
    }
}
Also note that this workaround will become slightly more involved if you want to test multiple coefficients.
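For instance, a joint Wald test of two coefficients against hypothesized values could look like this (a hypothetical sketch with generic names y, x1, x2; the F statistic is the Wald quadratic form divided by the number of restrictions):
qui reg y x1 x2, nocons
mat b = e(b)'                      // k x 1 column of coefficients
mat V = e(V)
mat R = I(2)                       // one row per restriction
mat r = (2 \ 2)                    // hypothesized values, one per restriction
mat W = (R*b - r)'*inv(R*V*R')*(R*b - r)
sca F = W[1,1]/2                   // divide by the number of restrictions
sca p = Ftail(2, e(df_r), F)       // substitute a tiny V, as above, if it is all zeros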
I'm not sure I'd advise these approaches without a word of caution: you are, in a very real sense, "making up" variance estimates. But without a variance estimate you won't be able to test the coefficients at all.