Additionally, because race is a categorical variable that has three potential values—1 if white, 2 if black, and 3 otherwise—we have to create a dummy variable in order to use this variable. The transformations we use are shown in Figure 3.
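As a rough sketch of this kind of transformation (the exact commands in Figure 3 may differ), the dummy can be built from the race codes described above. The use of Stata's nlswork example data and the variable name black are assumptions made for illustration.

```stata
* Assumption: a data set coded like Stata's nlswork example
* (race: 1 = white, 2 = black, 3 = other).
webuse nlswork, clear

* Indicator equal to 1 for black respondents and 0 otherwise;
* it is left missing when race itself is missing.
generate byte black = (race == 2) if !missing(race)

* Check the new variable against the original coding.
tabulate race black, missing
```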
The last step before estimating the regressions is to identify the data set as panel data. Figure 4 shows the two commands that must be entered for Stata to know that idcode is the individual category and that year is the time series variable.
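In recent Stata releases the same declaration can be made with a single xtset command. The sketch below is one way to do it and is not necessarily the pair of commands shown in Figure 4; it assumes the data are already in memory.

```stata
* Assumes the data set is already loaded.
* Declare idcode as the panel identifier and year as the time variable.
xtset idcode year

* Summarize the panel structure (individuals and observations per individual).
xtdescribe
```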
We are now ready to estimate the model (the natural logarithm of wages as a function of various variables). We begin with the random-effects model. Figure 5 shows the command and the results of the estimation of the random-effects model. There are several things to note here. First, in the command we can refer to all variables that have age in their names by using age*; the * tells Stata to use any variable that begins with the letters age. Second, we will need to use the estimation results in the Hausman test. Thus, we have stored these results in “random_effects” using the command estimates store random_effects.
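A minimal sketch of this step follows. The regressor list and the squared-term names (age2, ttl_exp2, tenure2) are assumptions modeled on Stata's nlswork example data, not necessarily the specification in Figure 5.

```stata
* Assumptions: the data are xtset, black has been created, and squared terms
* exist under hypothetical names such as age2, ttl_exp2, and tenure2.

* List the variables that the wildcard age* will match.
describe age*

* Random-effects regression; the variable list is illustrative only.
xtreg ln_wage grade age* ttl_exp* tenure* black not_smsa south, re

* Save the results under a name for later use in the Hausman test.
estimates store random_effects
```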
Notice that three R-squared values are reported in Figure 5. Also, wages reach a peak when the woman is
years old and after 9.795857 years on the job. The interpretation of the other variables demands a bit of algebra. For instance, the fact that
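black is a dummy variable affects our interpretation. Write the contribution of all other regressors as $x'\beta$ and the coefficient on black as $\beta_{black}$. When an individual is black, her wage level is:
$$\ln w_{B} = x'\beta + \beta_{black}$$
When she is nonblack, her wage level is
$$\ln w_{N} = x'\beta$$
Thus, we have:
$$\ln w_{B} - \ln w_{N} = \beta_{black}$$
or
$$\frac{w_{B}}{w_{N}} = e^{\beta_{black}}$$
Thus, the wage level of a black woman is, everything else held constant, 94.8 percent of the wage level of a nonblack woman.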
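Both quantities discussed above, the turning point of a quadratic profile and the wage ratio, can be computed from the stored coefficients. The sketch below assumes it is run right after the random-effects regression and that the squared tenure term is called tenure2, which is a hypothetical name.

```stata
* Turning point of the tenure profile: -b1 / (2 * b2),
* where b1 and b2 are the coefficients on tenure and tenure squared.
nlcom (peak_tenure: -_b[tenure] / (2 * _b[tenure2]))

* Ratio of a black woman's wage to a nonblack woman's wage: exp(coefficient).
nlcom (wage_ratio: exp(_b[black]))
```

Using nlcom rather than display also reports delta-method standard errors for the two derived quantities.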
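If we assume that grade is a continuous variable (it really is not), we have the following interpretation of the parameter:
$$\frac{\partial \ln w}{\partial\, grade} = \beta_{grade}$$
implies that
$$\frac{\partial w / w}{\partial\, grade} = \beta_{grade}$$
Thus, in our case an increase of 1 year of schooling causes wages to increase by approximately 6.46 percent.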
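Reading the coefficient directly as a percentage change is an approximation that works well for small coefficients. For a discrete one-year change, the log-linear model implies an exact proportional change of exp(coefficient) minus 1. If the coefficient is about 0.0646, as the 6.46 percent figure suggests, the exact change would be roughly 6.7 percent. A short sketch, assuming it is run right after the regression:

```stata
* Approximate effect of one more year of schooling, in percent.
display 100 * _b[grade]

* Exact effect implied by the log-linear model, in percent.
display 100 * (exp(_b[grade]) - 1)
```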
We can compare the results of using the re option with using the mle option (which directs Stata to use maximum likelihood techniques to estimate the parameters of the system). The mle parameter estimates, shown in Figure 6, are the same as those generated using the re option. However, the estimates of the standard errors (and, thus, the z-values) are different.
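One way to make this comparison is to store both sets of results and tabulate them side by side. The sketch assumes the random-effects results were stored as random_effects; the regressor list and the name mle_results are hypothetical.

```stata
* Same illustrative specification, estimated by maximum likelihood.
xtreg ln_wage grade age* ttl_exp* tenure* black not_smsa south, mle

* Store the results under a hypothetical name.
estimates store mle_results

* Coefficients and standard errors from both estimators, side by side.
estimates table random_effects mle_results, b(%9.4f) se(%9.4f)
```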
The estimation of the fixed-effects model is straightforward and is shown in Figure 7. The command is the same as in the random-effects model but with re replaced by fe. Notice from the results that the variables grade and black are dropped from the estimation results. They are dropped because the amount of schooling and the race of an individual are fixed over all observations. These two variables, thus, are perfectly correlated with the dummy variables that hold constant the individual-level characteristics. The effects of education and race differences are absorbed into the individual fixed effects.
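A sketch of the fixed-effects step and of the Hausman test mentioned earlier follows. It assumes the random-effects results were stored as random_effects; the regressor list and the name fixed_effects are hypothetical.

```stata
* Fixed-effects regression; time-invariant regressors such as grade and
* black are omitted by Stata because they do not vary within individuals.
xtreg ln_wage grade age* ttl_exp* tenure* black not_smsa south, fe

* Store the results under a hypothetical name.
estimates store fixed_effects

* Hausman test: the consistent estimator (fixed effects) is listed first,
* followed by the efficient estimator (random effects).
hausman fixed_effects random_effects
```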