[R-lang] Re: Coding

Nathaniel Smith njs@pobox.com
Sat Jun 19 13:19:32 PDT 2010


On Fri, Jun 18, 2010 at 2:01 PM, Peter Graff <graff@mit.edu> wrote:
> I have a question regarding the interpretation of different coding schemes
> in a regression model. My experiment compares 8 different levels of a single
> independent variable. Additionally, it is possible to conceive of 4 of these
> 8 levels as a mini-2x2 within the 8 level variable:
>
> Cells: A, B, C, D, E-1, E-2, F-1, F-2
>
> I hypothesize the following ordering (which is substantiated by the overall
> means in the different conditions):
>
> ABC>D>E1>F1>E2>F2
>
> Additionally, I hypothesize that E>F and 1>2. I have implemented 2 different
> models to test these hypotheses and I would like to hear your take on what
> the correct interpretation of the results is. In model 1 I have
> Helmert-coded the hypothesized ordering in terms of the following five
> contrasts:
[...]

It sounds to me like the individual hypotheses you want to test are:
  1) A = B = C  (maybe)
  2) A > D
  3) B > D
  4) C > D
  5) (or maybe the average of A, B, C, pooling them together, > D?)
  6) D > E1
  7) E1 > F1
  8) F1 > E2
  9) E2 > F2
  10) average(E1, E2) > average(F1, F2)
  11) average(E1, F1) > average(E2, F2)
Does that sound right?

Certainly it'd be possible to test all of these by setting up
particular clever coding schemes and then testing whether individual
model coefficients are different from zero, but it'd be a hassle and
you'd need multiple different coding schemes to get all of them.
Fortunately, for ordinary regression models (those fit with lm, glm,
mlm, and maybe others), you don't have to -- you can use a "linear
hypothesis test" to test for differences between coefficients
directly. (The result is the exact same F test you'd get by setting up
the clever contrast coding.)
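
As a quick sketch of that equivalence (with made-up two-group data, not
your design): the F from a linear hypothesis test matches the squared t
of the corresponding coefficient under default treatment coding.

```r
library(car)
# made-up two-group data, just to check the equivalence
f <- factor(rep(c("A", "B"), each = 10))
set.seed(42)
y <- rnorm(20)
m.means <- lm(y ~ 0 + f)               # cell-means coding
f.lht <- lht(m.means, "fA = fB")$F[2]  # lht test of A = B
m.dummy <- lm(y ~ f)                   # default treatment coding
t.val <- summary(m.dummy)$coefficients["fB", "t value"]
all.equal(f.lht, t.val^2)              # TRUE: the F is exactly the squared t
```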

I like the 'lht' function in package 'car' for doing this. The
easiest approach is to fit your model without an intercept, like "y ~ 0
+ factor", so that R will use "dummy" coding and the model
parameters will just be the mean of each group. Then you can write
tests to compare those means. Note that when we want to test whether X
is greater than Y, then as usual we test the null hypothesis of
equality -- but if lht rejects the null, then its output doesn't
actually tell you in which direction it rejected it. You have to look
and see which side was bigger yourself.

# Example setup:
> library(car)
> my.factor <- factor(rep(c("A", "B", "C", "D", "E1", "E2", "F1", "F2"), 3))
> set.seed(0)
> fake.response <- rnorm(length(my.factor))
> model <- lm(fake.response ~ 0 + my.factor)
> model
Call:
lm(formula = fake.response ~ 0 + my.factor)

Coefficients:
 my.factorA   my.factorB   my.factorC   my.factorD  my.factorE1  my.factorE2
    0.50314      0.39550      0.84303     -0.25471     -0.31909     -0.48401
my.factorF1  my.factorF2
   -0.36482      0.03265

Now to run hypothesis tests:
1) test A = B = C: lht(model, c("my.factorA = my.factorB", "my.factorB = my.factorC"))
Linear hypothesis test

Hypothesis:
my.factorA - my.factorB = 0
my.factorB - my.factorC = 0

Model 1: fake.response ~ 0 + my.factor
Model 2: restricted model

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     16 15.717
2     18 16.044 -2  -0.32739 0.1666  0.848
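
Since the F test is two-sided, remember to check the direction by hand.
Continuing the fake-data example above, the coefficients are just the
group means, so you can compare them directly:

```r
# refit the example model so this snippet stands alone
my.factor <- factor(rep(c("A", "B", "C", "D", "E1", "E2", "F1", "F2"), 3))
set.seed(0)
fake.response <- rnorm(length(my.factor))
model <- lm(fake.response ~ 0 + my.factor)
# the no-intercept coefficients are the group means; compare them directly
est <- coef(model)
est["my.factorA"] > est["my.factorD"]  # TRUE here: the A mean is larger
```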

I won't give outputs for the rest, but to give you the idea:
2) test A > D: lht(model, "my.factorA = my.factorD")
3) test B > D: lht(model, "my.factorB = my.factorD")
4) test C > D: lht(model, "my.factorC = my.factorD")
5) test the average of A, B, C > D: lht(model, "my.factorA + my.factorB + my.factorC = 3*my.factorD")
  (Note: if A, B, and C have different numbers of observations, there
are actually two ways to do this. When you average the A, B, C
coefficients, you might gain a smidgen more power by weighting each
by its number of observations; that would correspond to averaging all
the observations without regard to whether they were A, B, or C. The
version above corresponds to averaging the A observations, the B
observations, and the C observations separately, and then averaging
those averages. This distinction probably doesn't matter much in
practice.)
6) test D > E1: lht(model, "my.factorD = my.factorE1")
7) test E1 > F1: lht(model, "my.factorE1 = my.factorF1")
8) test F1 > E2: lht(model, "my.factorF1 = my.factorE2")
9) test E2 > F2: lht(model, "my.factorE2 = my.factorF2")
10) test average(E1, E2) > average(F1, F2): lht(model, "my.factorE1 + my.factorE2 = my.factorF1 + my.factorF2")
11) test average(E1, F1) > average(E2, F2): lht(model, "my.factorE1 + my.factorF1 = my.factorE2 + my.factorF2")

If you're looking for "main effects" of E vs. F, 1 vs. 2, then it's
probably a good idea to also check for the interaction, since "main
effects" are only really interpretable as such if the two factors
combine additively (i.e., there is no interaction). To test for the
interaction:
  lht(model, "my.factorE1 + my.factorF2 = my.factorE2 + my.factorF1")

Note that these tests of the "embedded 2 x 2" model will give slightly
different results than if you fit a standard 2x2 ANOVA to just the
E1/E2/F1/F2 data. Conceptually, the way an ANOVA test works is to
first look at the size of the effect, then use the residuals to
estimate how much noise there was, and finally compare the size of the
effect to the level of noise. If you fit the 2x2 model, you would get
the exact same effect estimates, but you would use only the residuals
from the E1/E2/F1/F2 groups to estimate the amount of noise; in the
lht-based "ANOVA" tests above, you use the full data set's residuals,
so the two approaches arrive at different noise estimates. Assuming
your groups *do* have equal levels of noise, the full-data estimate
will be more accurate (this is reflected in the higher number of
degrees of freedom in the lht-based F tests), and you'll get a bit
more power than if you fit the 2x2 model.
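
You can see the degrees-of-freedom difference concretely by running the
E-vs-F test both ways on the fake data from above:

```r
library(car)
my.factor <- factor(rep(c("A", "B", "C", "D", "E1", "E2", "F1", "F2"), 3))
set.seed(0)
fake.response <- rnorm(length(my.factor))
model <- lm(fake.response ~ 0 + my.factor)
# full-data version: noise estimated from all 24 observations (16 residual df)
lht(model, "my.factorE1 + my.factorE2 = my.factorF1 + my.factorF2")
# subset version: the same contrast, but noise estimated from only the
# 12 E/F observations (8 residual df)
keep <- my.factor %in% c("E1", "E2", "F1", "F2")
subfac <- droplevels(my.factor[keep])
sub <- lm(fake.response[keep] ~ 0 + subfac)
lht(sub, "subfacE1 + subfacE2 = subfacF1 + subfacF2")
```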

In general, lht can perform any test whose null hypothesis can be
expressed as a set of linear equations on your model coefficients; all
the classic regression and ANOVA tests are simple special cases of
this.
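
For completeness: lht also accepts the hypothesis as a matrix (one row
per linear equation, one column per coefficient), which can be handier
if you're generating tests programmatically. This sketch reproduces the
A = B = C test above:

```r
library(car)
my.factor <- factor(rep(c("A", "B", "C", "D", "E1", "E2", "F1", "F2"), 3))
set.seed(0)
fake.response <- rnorm(length(my.factor))
model <- lm(fake.response ~ 0 + my.factor)
# rows are equations over the 8 coefficients (A, B, C, D, E1, E2, F1, F2);
# rhs gives the right-hand side of each equation
H <- rbind(c(1, -1, 0, 0, 0, 0, 0, 0),   # A - B = 0
           c(0, 1, -1, 0, 0, 0, 0, 0))   # B - C = 0
lht(model, H, rhs = c(0, 0))
```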

Hope that helps,
-- Nathaniel


More information about the ling-r-lang-L mailing list