Channel: Statalist

Confused with factor variable interactions

Hello everyone,

consider the following
Code:
sysuse auto, clear
gen byte heavy = weight > 3000

regress price mpg i.foreign##i.heavy
lincom 1.foreign + 1.foreign#1.heavy

gen byte nonhvyfor = (!heavy & foreign)
gen byte hvydom = (heavy & !foreign)
gen byte hvyfor = (heavy & foreign)

regress price mpg nonhvyfor hvydom hvyfor
lincom hvyfor - hvydom
This produces the following output (I have omitted the output of the unimportant commands):
Code:
. regress price mpg i.foreign##i.heavy

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  4,    69) =    8.91
       Model |   216345984     4  54086495.9           Prob > F      =  0.0000
    Residual |   418719412    69  6068397.28           R-squared     =  0.3407
-------------+------------------------------           Adj R-squared =  0.3024
       Total |   635065396    73  8699525.97           Root MSE      =  2463.4

-------------------------------------------------------------------------------
        price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
          mpg |  -211.8713    68.7862    -3.08   0.003     -349.096    -74.6466
              |
      foreign |
     Foreign  |    1710.16   842.3177     2.03   0.046     29.78266    3390.538
      1.heavy |   1074.217   911.9885     1.18   0.243    -745.1499    2893.585
              |
foreign#heavy |
   Foreign#1  |   3483.295    1985.39     1.75   0.084    -477.4488    7444.039
              |
        _cons |   9508.832   1842.119     5.16   0.000     5833.906    13183.76
-------------------------------------------------------------------------------

. lincom 1.foreign + 1.foreign#1.heavy

 ( 1)  1.foreign + 1.foreign#1.heavy = 0

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   5193.456   1794.605     2.89   0.005     1613.317    8773.594
------------------------------------------------------------------------------

. regress price mpg nonhvyfor hvydom hvyfor

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  4,    69) =    8.91
       Model |   216345984     4  54086495.9           Prob > F      =  0.0000
    Residual |   418719412    69  6068397.28           R-squared     =  0.3407
-------------+------------------------------           Adj R-squared =  0.3024
       Total |   635065396    73  8699525.97           Root MSE      =  2463.4

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -211.8713    68.7862    -3.08   0.003     -349.096    -74.6466
   nonhvyfor |    1710.16   842.3177     2.03   0.046     29.78266    3390.538
      hvydom |   1074.217   911.9885     1.18   0.243    -745.1499    2893.585
      hvyfor |   6267.673   1969.226     3.18   0.002     2339.175    10196.17
       _cons |   9508.832   1842.119     5.16   0.000     5833.906    13183.76
------------------------------------------------------------------------------

. lincom hvyfor - hvydom

 ( 1)  - hvydom + hvyfor = 0

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   5193.456   1794.605     2.89   0.005     1613.317    8773.594
------------------------------------------------------------------------------
There are two binary explanatory variables, foreign and heavy, and thus four groups:
  1. nonhvydom: cars that are domestic and not heavy (base group)
  2. hvydom: cars that are domestic and heavy
  3. nonhvyfor: cars that are foreign and not heavy
  4. hvyfor: cars that are foreign and heavy
I would like to know how much higher the predicted price of a heavy foreign car is than that of a heavy domestic car.

In the first regression I use factor variables and their interaction. Until now I thought that the coefficient on the interaction between the two binary variables gave me this difference (3,483).

In the second regression I use three binary variables, one for each group other than the base group (cars that are domestic and not heavy). The difference I am looking for should then be the coefficient on hvyfor minus the coefficient on hvydom, which is 5,193.5.

The lincom estimate after the first regression shows that, to reproduce this difference from the first regression, I have to add the coefficient on 1.foreign to the coefficient on the 1.foreign#1.heavy interaction. That doesn't seem right, does it? What am I missing?
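
One way to line up the two parameterisations is to write out the predicted price for each of the four groups. This is only a sketch and not part of the Stata output. The symbols $\beta_0,\dots,\beta_4$ are notation assumed here for the coefficients of the first regression (_cons, mpg, 1.foreign, 1.heavy and 1.foreign#1.heavy, in that order), and $m$ stands for any fixed value of mpg.

```latex
\begin{align*}
\text{domestic, not heavy (base):} \quad & \beta_0 + \beta_1 m \\
\text{domestic, heavy (hvydom):}   \quad & \beta_0 + \beta_1 m + \beta_3 \\
\text{foreign, not heavy (nonhvyfor):} \quad & \beta_0 + \beta_1 m + \beta_2 \\
\text{foreign, heavy (hvyfor):}    \quad & \beta_0 + \beta_1 m + \beta_2 + \beta_3 + \beta_4 \\[6pt]
\text{hvyfor} - \text{hvydom}      \;=\; & (\beta_2 + \beta_3 + \beta_4) - \beta_3 \;=\; \beta_2 + \beta_4 \\
\beta_4 \;=\; & (\text{hvyfor} - \text{hvydom}) - (\text{nonhvyfor} - \text{base})
\end{align*}
```

With the estimates reported above, $\beta_2 + \beta_4 = 1710.16 + 3483.295 \approx 5193.46$, which agrees with both lincom results, and $\beta_2 + \beta_3 + \beta_4 \approx 6267.67$ agrees with the coefficient on hvyfor in the second regression. Read this way, the interaction coefficient on its own is a difference in differences: how much larger the foreign premium is among heavy cars than among cars that are not heavy.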

Thank you.
