The interpretation of the numbers in Table 5 is straightforward. Consider individual 1. The z-value predicted for this individual is -0.68. Using the standard normal tables reported in Table 11, it is easy to see that Φ(-0.68) = 0.2483. The difference between this number and the value reported for phat in Table 5 is due to rounding error: the table lookup uses the z-value rounded to two decimal places.
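The table lookup can also be cross-checked inside Stata. The lines below are a minimal sketch, assuming a release in which the cumulative standard normal function is called normal() (older releases name it norm()); the full-precision z-value is the one listed for observation 1 in Table 6.

```stata
* Cumulative normal at the rounded z-value, as read from a printed table
display normal(-0.68)
* Same calculation at the full precision stored in zbhat for observation 1
display normal(-0.6889973)
```

The first line should agree with the printed table, and the second with the stored value of phat, which makes the source of the small discrepancy visible.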
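A little later we will want to calculate the inverse Mills ratio. As noted in (8), the inverse Mills ratio is the standard normal density divided by the standard normal cumulative distribution function, both evaluated at zbhat: φ(zbhat) / Φ(zbhat). The variable phat is equal to Φ(zbhat). Stata offers an easy way to calculate φ(zbhat) with the function “normden(zbhat)” as follows: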
. generate imratio = normden(zbhat)/phat
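As a quick check of the formula, the ratio for observation 1 can be computed by hand from the values in Table 6. This is only a sketch: it assumes a Stata release that provides the newer function names normalden() and normal(), and imratio2 is a hypothetical variable name used here so that the original imratio is not overwritten.

```stata
* Inverse Mills ratio for observation 1, using its zbhat value from Table 6
display normalden(-0.6889973) / normal(-0.6889973)
* Variable-level version with the newer function name (imratio2 is illustrative)
generate imratio2 = normalden(zbhat) / phat
```

The displayed number should match the entry for observation 1 in the last column of Table 6 up to rounding.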
Table 6 repeats Table 5 with the estimate of the inverse Mills ratio for the first 10 observations.
Table 6. Calculation of the inverse Mills ratio for the first 10 observations.

| Observation | zbhat | phat | Inverse Mills ratio |
|---|---|---|---|
| 1 | -0.6889973 | 0.2454125 | 1.2821240 |
| 2 | -0.2029016 | 0.4196060 | 0.9313837 |
| 3 | -0.4806706 | 0.3153753 | 1.1269680 |
| 4 | -0.1681804 | 0.4332207 | 0.9079438 |
| 5 | 0.3485867 | 0.6363002 | 0.5900134 |
| 6 | 0.5875849 | 0.7215945 | 0.4652062 |
| 7 | 0.9735670 | 0.8348642 | 0.2974918 |
| 8 | 0.4597758 | 0.6771615 | 0.5300468 |
| 9 | 0.0179909 | 0.5071769 | 0.7864666 |
| 10 | 0.3262833 | 0.6278950 | 0.6024283 |
The two Heckman estimates
One of the great advantages of using an econometrics program like
Stata is that the authors quite often have created a command that does all of the work for the user. In our case, the commands we need to run to generate the maximum likelihood estimate of the Heckman model are:
. global wage_eqn wage educ age
. global seleqn married children age education
. heckman $wage_eqn, select($seleqn)
Notice that we have used the global command to create a shortcut for referring to each of the two equations in the estimation. The command for the Heckman two-stage estimate is:
. heckman $wage_eqn, select($seleqn) twostep
. predict mymills, mills
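Because global macros are plain text substitution, it can help to see what the shortcut commands expand to. The following is a sketch that assumes the two global definitions given earlier are still in memory; the written-out command uses the documented option name twostep.

```stata
* Show the text each macro expands to
display "$wage_eqn"
display "$seleqn"
* The two-step command written out in full, without macros
heckman wage educ age, select(married children age education) twostep
```

Writing the command out once in full is a simple way to confirm that each macro holds the intended variable list before running the estimation.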
Table 7. Comparison of the Heckman maximum-likelihood and Heckman two-step estimates with the probit estimates of the selection equation. Test statistics are reported in parentheses below the coefficients.

| (1) Explanatory variable | (2) Maximum likelihood estimate | (3) Heckman two-step | (4) Probit estimate of the selection equation |
|---|---|---|---|
| Wage equation | | | |
| Education | 0.9899537 | 0.9825259 | — |
| | (18.59) | (18.23) | |
| Age | 0.2131294 | 0.2118695 | — |
| | (10.34) | (9.61) | |
| Intercept | 0.4857752 | 0.7340391 | — |
| | (0.45) | (0.59) | |
| Selection equation | | | |
| Married | 0.4451721 | 0.4308575 | 0.4308575 |
| | (6.61) | (5.81) | (5.81) |
| Children | 0.4387068 | 0.4473249 | 0.4473249 |
| | (15.79) | (15.56) | (15.56) |
| Age | 0.0365098 | 0.0347211 | 0.0347211 |
| | (8.79) | (8.21) | (8.21) |
| Education | 0.0557318 | 0.0583645 | 0.0583645 |
| | (5.19) | (5.32) | (5.32) |
| Intercept | -2.491015 | -2.467365 | -2.467365 |
| | (-13.16) | (-12.81) | (-12.81) |
| ρ | 0.7035061 | 0.67284 | — |
| σ | 6.004797 | 5.9473529 | — |
| λ | 4.224412 | 4.001615 | — |
| | | (6.60) | |
| Observations | 2000 | 2000 | 2000 |
| Number of women not working | 657 | 657 | 657 |
| Number of women working | 1343 | 1343 | 1343 |
| Log likelihood | -5178.304 | — | -1027.0616 |
| | 508.44 | — | — |
| | 0.0000 | — | — |
| | — | 551.37 | — |
| | — | 0.0000 | — |
| LR test of independent equations (ρ = 0) | | | |
| | 61.20 | — | 478.32 |
| | 0.0000 | — | 0.0000 |
The second command, predict, saves the estimates of the inverse Mills ratio; we have retrieved these values in order to check our earlier calculations. Table 7 reports the results of these two estimations. Column 2 reports the maximum-likelihood estimates, Column 3 reports the Heckman two-step estimates, and Column 4 reports the probit estimates of the selection equation as reported in Table 4. The estimates for the two methods are very similar. Of course, the probit estimates in Column 4 exactly match the results reported for the selection equation in Column 3, because the first step of the two-step method is that same probit. As a final check, Table 8 compares the values of the inverse Mills ratio reported in Table 6 with the values of the inverse Mills ratio calculated by the Heckman two-step method. The two sets of estimates are identical except for some rounding errors.
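A comparison of this kind could be reproduced along the following lines. This is a sketch under stated assumptions: imratio and mymills are the variables created by the commands shown earlier, while milldiff is a hypothetical helper variable introduced only for this check.

```stata
* Absolute gap between the hand-computed ratio and the one saved by predict
generate double milldiff = abs(mymills - imratio)
summarize milldiff
* Side-by-side listing for the first 10 observations
list zbhat phat imratio mymills in 1/10
```

If the two calculations agree, the maximum of milldiff reported by summarize should be very close to zero, with any remaining gap reflecting storage precision.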