INTERACTIONS AMONG DICHOTOMOUS PREDICTORS IN REGRESSION

David P. Nichols

August 1995

Here we continue the discussion of parameterization of linear regression models involving categorical predictor variables from Keywords issues 56 and 57. In this article we will deal with the problem of interactions among categorical predictors. We will continue to use the United States data with 1990 murder rate as our dependent variable, and also to suppose that the states might be viewed as a random sample from some theoretical population of interest, in order to motivate attention to test statistics.

We'll deal with the simplest case of interaction, that between two binary or dichotomous predictor variables. DEATHPEN, introduced in issue 56, is a 0-1 variable indicating absence or presence of a death penalty statute in 1989-90. CULTURE is a new variable, again 0-1, indicating absence or presence of a certain level of influence of a set of cultural characteristics thought by some social scientists to be involved in the production of high rates of certain types of violence. Though both predictor variables are measured prior to the dependent variable and are potentially causal contributors, there are undoubtedly a number of other factors left out of our simple model. Thus, the relationships shown here should be taken only as an illustration of basic regression methods and not as a rigorous analysis of the contributors to murder rates.

Figure 1 contains the means, standard deviations and cell counts for the four combinations of the two predictors. We can see that the means for states without the cultural factor are lower than those with it, and that on average the states without the death penalty have lower means. However, we also see that among the CULTURE 0 states the mean is higher for the states without the death penalty (4.840 versus 4.092). There is thus evidence of a differential impact of DEATHPEN depending on what level of CULTURE one considers. In other words, it would appear that DEATHPEN interacts with CULTURE in its effects on MURDER90. Formally, an interaction means that the effect of one predictor on the dependent variable depends upon the level of the other predictor considered. (Incidentally, the 13 states in the CULTURE 0-DEATHPEN 1 cell do not include any of the states carrying out executions in 1989. Though a 2x3 analysis using the three level STATUS89 variable from issue 57 is perhaps a more theoretically satisfactory one, it results in an empty combination of predictors, which produces issues too complicated to be dealt with in this brief article.)

Figure 1: Descriptive Statistics
---------------------------------------------------------------------------
Variable .. MURDER90

  FACTOR         CODE         Mean    Std. Dev.      N

  CULTURE         0
    DEATHPEN      0          4.840      4.307       10
    DEATHPEN      1          4.092      1.506       13
  CULTURE         1
    DEATHPEN      0          5.300      1.671        4
    DEATHPEN      1          9.996      2.796       23
---------------------------------------------------------------------------

We will first analyze the data using the REGRESSION procedure, entering the two dummy variables CULTURE and DEATHPEN, and a product variable computed by multiplying the two variables. This INTERACT product variable, when entered along with CULTURE and DEATHPEN, represents the interaction of CULTURE and DEATHPEN. The results of the regression are given in Figure 2.

Figure 2: REGRESSION results with dummy coding
---------------------------------------------------------------------------
Multiple R            .70702
R Square              .49988
Adjusted R Square     .46726
Standard Error       2.85354

Analysis of Variance
               DF    Sum of Squares    Mean Square
Regression      3         374.38140      124.79380
Residual       46         374.56280        8.14267

F = 15.32591       Signif F = .0000

------------------ Variables in the Equation ------------------
Variable              B         SE B        Beta        T    Sig T
CULTURE         .460000     1.688175     .059237     .272    .7865
DEATHPEN       -.747692     1.200261    -.086742    -.623    .5364
INTERACT       5.443344     1.957121     .700974    2.781    .0078
(Constant)     4.840000      .902367                5.364    .0000
---------------------------------------------------------------------------

The most important thing to notice here is that the INTERACT variable has a significance level of .0078, indicating that the interaction term should remain in the model. Some people might look at the significance levels for the CULTURE and DEATHPEN "main effects," which are both well above .05, and conclude that we have a situation where there is an interaction but no main effects. This evinces a misunderstanding of the meaning of an interaction. An interaction means that the effects of a variable differ across the levels of another variable. In order for the effects of one variable to differ at different levels of another variable, some of these effects must be nonzero. Logically then, an interaction implies that all involved main effects are present as well.

An important feature of the interaction model is brought out by comparing these results with those from the MANOVA procedure, where we've fitted the same model, but used a somewhat different parameterization. Recall that in our previous analyses involving only main effects, the parameterization did not change the overall main effects F-test (only the constant was affected). We see here that this no longer holds once an interaction is introduced into the model. Figure 3 presents the (edited) MANOVA results using SIMPLE(1) contrasts, which request comparison of the second category of each factor to the first, just as is produced by dummy coding when we fit only main effects.

Figure 3: MANOVA results with SIMPLE(1) contrasts
---------------------------------------------------------------------------
Parameter         Coeff.   Std. Err.    t-Value    Sig. t   Lower -95%  CL- Upper
CONSTANT      6.05698997      .48928   12.37939    .00000      5.07212    7.04186
CULTURE       3.18167224      .97856    3.25138    .00215      1.21193    5.15141
DEATHPEN      1.97397993      .97856    2.01723    .04953       .00424    3.94372
CULTURE BY    5.44334448     1.95712    2.78130    .00782      1.50386    9.38282
 DEATHPEN
---------------------------------------------------------------------------

Again, the important thing to look at is the interaction term, which is identical to that given by REGRESSION. Note that changing the parameterization might change the scaling of the parameter here, but would not change the value of the t-statistic or its significance. Relationships among individual parameters are more complicated when factors have more than two levels, but the overall F-statistics remain the same regardless of parameterization when the highest order term is being considered. Since all terms here have one degree of freedom, the F-tests test the same thing as the t-tests, and have been omitted to save space.
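The invariance of the t-statistic to rescaling can be checked from the summary statistics in Figure 1 alone. What follows is a minimal Python sketch, assuming NumPy is available. The helper name contrast_test and the divide-by-4 rescaling (the scaling that a -1/+1 coding of both factors would produce) are illustrative assumptions and are not part of the original analyses. Because Figure 1 is printed to three decimals, the results agree with the procedure output only to about that precision.

```python
import numpy as np

# Cell summaries from Figure 1, ordered
# (CULTURE, DEATHPEN) = (0,0), (0,1), (1,0), (1,1).
means = np.array([4.840, 4.092, 5.300, 9.996])
sds = np.array([4.307, 1.506, 1.671, 2.796])
ns = np.array([10, 13, 4, 23])

# Pooled within-cell variance: the residual mean square of the
# four-parameter model, which fits the four cell means exactly.
df_resid = ns.sum() - len(ns)
mse = ((ns - 1) * sds**2).sum() / df_resid


def contrast_test(weights):
    """Return (estimate, standard error, t) for a weighted sum of cell means."""
    estimate = weights @ means
    std_err = np.sqrt(mse * (weights**2 / ns).sum())
    return estimate, std_err, estimate / std_err


# The interaction as a difference of differences among the cell means.
interaction = np.array([1.0, -1.0, -1.0, 1.0])
est_a, se_a, t_a = contrast_test(interaction)
est_b, se_b, t_b = contrast_test(interaction / 4)  # same contrast, rescaled

print(f"original scale: est={est_a:.3f}  se={se_a:.3f}  t={t_a:.3f}")
print(f"rescaled:       est={est_b:.3f}  se={se_b:.3f}  t={t_b:.3f}")
print(f"F = t squared = {t_a**2:.3f} on 1 and {df_resid} df")
```

The estimate and its standard error shrink by the same factor, so their ratio is unchanged, and with one degree of freedom the F-statistic is simply the square of that t.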

The next thing we might notice is that according to the MANOVA results, both of the "main effects" are significant, unlike in the REGRESSION results. This is where understanding how the model has been parameterized is crucial. Let's look at the "parameter codings" or basis matrices used in the two analyses to see how we can square the two sets of findings. Figure 4 gives the values of the codings for the dummy approach. The 4x4 matrix on the right is the basis or design matrix (X) used in the linear model. The contrast matrix (C) produced by this coding scheme is given in Figure 5. The relationship between the two matrices can be verified by evaluating the following equation:

C = (X'X)^{-1} X'

Figure 4: Parameterization using dummy codings
---------------------------------------------------------------------------
CULTURE  DEATHPEN  |  CONSTANT  CULTURE  DEATHPEN  INTERACT
   0        0      |     1         0        0         0
   0        1      |     1         0        1         0
   1        0      |     1         1        0         0
   1        1      |     1         1        1         1
---------------------------------------------------------------------------

Reading across the first row of Figure 4, we can see that the CULTURE 0, DEATHPEN 0 states are represented completely by the CONSTANT parameter. This is borne out by the values in the first row of the contrast matrix in Figure 5, where the CONSTANT is seen to be simply the mean of the CULTURE 0, DEATHPEN 0 group, and by noting that the 4.84 value for the CONSTANT in the REGRESSION output is the mean for those states. The second row of the basis shows that the CULTURE 0, DEATHPEN 1 mean is modeled by summing the CONSTANT and DEATHPEN parameters, which implies that the DEATHPEN parameter compares this group to the CULTURE 0, DEATHPEN 0 group. The third row of the contrast matrix shows that the DEATHPEN parameter is indeed comparing these two groups. Again, you should be able to derive the parameter estimate value from the appropriate means (within printed levels of precision). Thus the DEATHPEN effect here is really the simple main effect of DEATHPEN at the 0 level of CULTURE (4.092-4.84=-.748), which according to the significance level printed is quite possibly chance variation. The third row of the basis in Figure 4 and the second row of the contrast matrix in Figure 5 show that the CULTURE parameter is assessing the simple main effect of CULTURE for the DEATHPEN 0 states (5.3-4.84=.46), which is also easily attributable to chance. Finally, the INTERACT parameter estimates the difference between the simple main effects of CULTURE at the two levels of DEATHPEN and vice versa:

5.44 = (9.996-4.092) - (5.3-4.84) = (9.996-5.3) - (4.092-4.84).

Thinking about the interaction parameter in this way illustrates how an interaction implies main effects: interactions are differences among differences, and if all differences are 0, then by definition the differences among the differences must also be 0.

Figure 5: Contrasts estimated by dummy codings
---------------------------------------------------------------------------
             CULTURE0    CULTURE0    CULTURE1    CULTURE1
             DEATHPEN0   DEATHPEN1   DEATHPEN0   DEATHPEN1
CONSTANT         1           0           0           0
CULTURE         -1           0           1           0
DEATHPEN        -1           1           0           0
INTERACT         1          -1          -1           1
---------------------------------------------------------------------------
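The relationship C = (X'X)^{-1} X' between Figures 4 and 5, and the derivation of the REGRESSION estimates from the cell means, can be checked numerically. The following is a minimal Python sketch, assuming NumPy is available; the variable names are illustrative. It uses the cell-level 4x4 basis from Figure 4 and the rounded summary statistics from Figure 1, so the standard errors and t-values match Figure 2 only to roughly three decimals.

```python
import numpy as np

# Basis matrix X from Figure 4. Rows are the cells
# (CULTURE, DEATHPEN) = (0,0), (0,1), (1,0), (1,1);
# columns are CONSTANT, CULTURE, DEATHPEN, INTERACT.
X = np.array([
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
], dtype=float)

# Contrast matrix: each row holds the weights applied to the four cell means.
C = np.linalg.inv(X.T @ X) @ X.T
print(np.round(C).astype(int))  # compare with Figure 5

# Cell summaries from Figure 1, in the same cell order as the rows of X.
means = np.array([4.840, 4.092, 5.300, 9.996])
sds = np.array([4.307, 1.506, 1.671, 2.796])
ns = np.array([10, 13, 4, 23])

# Parameter estimates are the contrasts applied to the cell means.
estimates = C @ means

# Residual mean square pooled from the within-cell standard deviations.
mse = ((ns - 1) * sds**2).sum() / (ns.sum() - len(ns))

# Each cell mean has variance mse / n, so a contrast with weights c
# has variance mse * sum(c**2 / n).
std_errs = np.sqrt(mse * (C**2 / ns).sum(axis=1))
t_values = estimates / std_errs

names = ["(Constant)", "CULTURE", "DEATHPEN", "INTERACT"]
for name, b, se, t in zip(names, estimates, std_errs, t_values):
    print(f"{name:<11} B={b:8.3f}  SE={se:6.3f}  t={t:6.3f}")
```

Only the INTERACT row uses all four cells. The CULTURE and DEATHPEN rows each compare a single pair of cells, which is why their standard errors are driven by the sizes of those two cells alone.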

Now let's look at the basis and contrast matrices for the SIMPLE(1) coding used in MANOVA (Figures 6 and 7). The last line of Figure 7 shows that the contrast estimated by the interaction parameter is the same as with dummy coding, which fits with our earlier observation that the parameter estimates for the interaction term were the same in both analyses. As we noted earlier though, none of the other parameters represents the same thing. The similarity of the basis and contrast matrices for the SIMPLE contrasts is an artifact of the 2x2 design, and does not hold for designs involving larger numbers of levels. In general, it is difficult to use the basis matrix with SIMPLE contrasts to see what is being estimated.

Figure 6: Parameterization using SIMPLE(1) codings
---------------------------------------------------------------------------
CULTURE  DEATHPEN  |  CONSTANT  CULTURE  DEATHPEN  INTERACT
   0        0      |     1        -.5      -.5       .25
   0        1      |     1        -.5       .5      -.25
   1        0      |     1         .5      -.5      -.25
   1        1      |     1         .5       .5       .25
---------------------------------------------------------------------------

The contrast matrix shows us that the CONSTANT parameter is simply estimating the (unweighted) average of the four means. The CULTURE parameter estimates the average of the two CULTURE 1 cells minus the average of the two CULTURE 0 cells, while the DEATHPEN parameter estimates the average of the two DEATHPEN 1 cells minus the average of the two DEATHPEN 0 cells. You should be able to use the cell means to reproduce the parameter estimate values. Note that what are labeled as main effects here are averages of the simple main effects of each factor across the levels of the other factor. Thus the 1.974 coefficient for DEATHPEN is the average of the 9.996-5.3=4.696 difference at level 1 of CULTURE and the 4.092-4.84=-.748 difference at level 0 of CULTURE. Though such "main effects" have become the norm for computer output from ANOVA/linear models procedures, largely due to the simplicity of the hypotheses tested, the danger in taking averages of different effects as representative of the whole is well illustrated by this example. If DEATHPEN were a treatment, CULTURE denoted two types of patients and the dependent variable were a measure of health, for example, we might conclude based on the averaged effect that the treatment is good for everyone, when the results are really telling us that it has no effect or is perhaps harmful for one group of patients.

Figure 7: Contrasts estimated by SIMPLE(1) codings
---------------------------------------------------------------------------
             CULTURE0    CULTURE0    CULTURE1    CULTURE1
             DEATHPEN0   DEATHPEN1   DEATHPEN0   DEATHPEN1
CONSTANT        .25         .25         .25         .25
CULTURE        -.50        -.50         .50         .50
DEATHPEN       -.50         .50        -.50         .50
INTERACT       1.00       -1.00       -1.00        1.00
---------------------------------------------------------------------------
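The same check can be made for the SIMPLE(1) parameterization. This is a minimal Python sketch, assuming NumPy is available and using illustrative variable names. It derives the contrast matrix from the Figure 6 basis, applies it to the Figure 1 cell means, and confirms that the DEATHPEN "main effect" is the average of the two simple main effects. Values agree with Figure 3 only to the three-decimal precision of the printed means.

```python
import numpy as np

# Basis matrix X from Figure 6. Rows are the cells
# (CULTURE, DEATHPEN) = (0,0), (0,1), (1,0), (1,1);
# columns are CONSTANT, CULTURE, DEATHPEN, INTERACT.
X = np.array([
    [1, -0.5, -0.5,  0.25],
    [1, -0.5,  0.5, -0.25],
    [1,  0.5, -0.5, -0.25],
    [1,  0.5,  0.5,  0.25],
])

# Contrast matrix C = (X'X)^{-1} X'; compare with Figure 7.
C = np.linalg.inv(X.T @ X) @ X.T
print(np.round(C, 2))

# Cell means from Figure 1, in the same cell order as the rows of X.
means = np.array([4.840, 4.092, 5.300, 9.996])

# Estimates in the order CONSTANT, CULTURE, DEATHPEN, INTERACT.
estimates = C @ means
print(np.round(estimates, 3))

# Simple main effects of DEATHPEN within each level of CULTURE.
deathpen_at_culture0 = means[1] - means[0]
deathpen_at_culture1 = means[3] - means[2]
averaged_effect = (deathpen_at_culture0 + deathpen_at_culture1) / 2

print(f"DEATHPEN at CULTURE 0: {deathpen_at_culture0:.3f}")
print(f"DEATHPEN at CULTURE 1: {deathpen_at_culture1:.3f}")
print(f"average of the two:    {averaged_effect:.3f}")
```

The two simple effects have opposite signs, yet their average is positive, which is the sense in which the averaged "main effect" can misrepresent what happens in either group.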

In addition to using the means to verify the interpretations of the parameter estimates, you should be able to use the basis matrices and parameter estimate values to reproduce the predicted values for each cell for both analyses. If you do this, you will see that the predictions produced in each case are identical. This is true because we have fitted exactly the same model in both cases. The choice of parameterization strategy affects only the interpretation of the individual parameters, not the overall model. The most important thing to note here is that the interaction is the only term that is independent of the choice of parameterizations. The presence of an interaction means that there is no single main effect for a factor involved in that interaction, so different choices of parameterization lead to different interpretations for the "main effects."
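The equality of the predicted values can be sketched in a few lines of Python, assuming NumPy is available; the variable names are illustrative. The basis matrices are those of Figures 4 and 6, and the coefficients are copied from the printed output in Figures 2 and 3, so agreement holds only to the precision of that output.

```python
import numpy as np

# Rows of both bases are the cells
# (CULTURE, DEATHPEN) = (0,0), (0,1), (1,0), (1,1);
# columns are CONSTANT, CULTURE, DEATHPEN, INTERACT.
X_dummy = np.array([
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
], dtype=float)

X_simple = np.array([
    [1, -0.5, -0.5,  0.25],
    [1, -0.5,  0.5, -0.25],
    [1,  0.5, -0.5, -0.25],
    [1,  0.5,  0.5,  0.25],
])

# Coefficients as printed, reordered to match the columns above.
b_dummy = np.array([4.840000, 0.460000, -0.747692, 5.443344])          # Figure 2
b_simple = np.array([6.05698997, 3.18167224, 1.97397993, 5.44334448])  # Figure 3

# Predicted value for each cell is the basis row times the coefficients.
pred_dummy = X_dummy @ b_dummy
pred_simple = X_simple @ b_simple

print(np.round(pred_dummy, 3))
print(np.round(pred_simple, 3))
print(np.allclose(pred_dummy, pred_simple, atol=1e-5))
```

Both sets of predictions are the four cell means from Figure 1, since a model with four parameters reproduces four cell means exactly.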