[R-lang] Re: Coding

Sat Jun 19 12:15:47 PDT 2010

Hey Peter, Steve, and r-lang,
   Not sure if I'm going to be able to answer your questions, but I'll give
it a shot.  I think Steve is right in the sense that forward difference
coding would get you the answers to the questions about whether one level is
significantly different than the previous level (i.e., whether D > E1, rather
than whether D > the mean of E1, F1, E2, F2, which is what Helmert tests).
However, forward difference coding does not produce orthogonal coding
variables - i.e., below, the sum of the products of AvB and BvC is != 0, and
the same goes for BvC and CvD.  So, there will be some degree of collinearity
among the coding variables if you use this coding scheme, which is an issue
for getting estimates you can trust.

     AvB    BvC    CvD
A     .5      0      0
B    -.5     .5      0
C      0    -.5     .5
D      0      0    -.5
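
The non-orthogonality in the table above can be checked directly. This is a
minimal sketch in plain Python, assuming equal cell sizes; the names
`forward_diff`, `dot`, and `dots` are made up for illustration.

```python
# Forward difference contrast weights from the table above (levels A, B, C, D).
forward_diff = {
    "AvB": [0.5, -0.5, 0.0, 0.0],
    "BvC": [0.0, 0.5, -0.5, 0.0],
    "CvD": [0.0, 0.0, 0.5, -0.5],
}


def dot(u, v):
    """Sum of the element-wise products of two weight vectors."""
    return sum(a * b for a, b in zip(u, v))


# Compare every pair of contrasts; a nonzero sum means the pair is not orthogonal.
names = list(forward_diff)
dots = {}
for i, first in enumerate(names):
    for second in names[i + 1:]:
        dots[(first, second)] = dot(forward_diff[first], forward_diff[second])
        print(first, second, dots[(first, second)])
```

Adjacent contrasts (AvB with BvC, BvC with CvD) share a level, so their
products do not cancel; AvB and CvD share no level and sum to zero.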

As for your second question about inferring whether E>F even though your EvF
contrast isn't significant... I think that there is a problem doing that
because in the first model, where you established an order with Helmert
coding, you could end up in a situation like the one Steve alluded to.  In
the hypothetical example below, E1vF1E2F2 is likely to be significant
(.80 > .45), as is F1vE2F2 (.78 > .29). But neither contrast asks whether
E > F overall.  In this example, it's likely there isn't a difference
(.55 vs. .53).

E1 =  .80
F1 =  .78
E2 =  .30
F2 =  .28
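
The arithmetic behind those comparisons can be reproduced as follows. This is
only a sketch on the made-up means above, assuming equal cell sizes so that
unweighted means are appropriate; it compares means and says nothing about
significance. The names `means`, `mean_of`, and `comparisons` are illustrative.

```python
# Hypothetical cell means from the example above.
means = {"E1": 0.80, "F1": 0.78, "E2": 0.30, "F2": 0.28}


def mean_of(*cells):
    """Unweighted mean of the named cells (assumes equal cell sizes)."""
    return sum(means[c] for c in cells) / len(cells)


# Each entry pairs the left-hand side of a contrast with its right-hand side.
comparisons = {
    "E1vF1E2F2": (means["E1"], mean_of("F1", "E2", "F2")),
    "F1vE2F2": (means["F1"], mean_of("E2", "F2")),
    "EvF": (mean_of("E1", "E2"), mean_of("F1", "F2")),
}

for name, (left, right) in comparisons.items():
    print(f"{name}: {left:.2f} vs {right:.2f} (difference {left - right:.2f})")
```

The two Helmert-style comparisons show large gaps that are driven by the 1v2
difference, while the E versus F marginal means differ by only about .02.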

Without knowing your data, but following the adage that your model can only
answer what you ask it, I would say that the second model better tests your
actual hypotheses -- it's the only one that actually asks the questions you
want to know about EvF and 1v2. It also does ask some questions about order:
Are ABC different than DE1F1E2F2?  Is D different than the Es and Fs
combined?  If the theories/conditions you're testing only need to show those
patterns to be supported/refuted, then you should be in good shape. :)
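
For illustration, one possible set of weights for the second model's four
contrasts is sketched below. The weights are an assumption, not Peter's actual
coding: A, B, and C are pooled into a single ABC group, cell sizes are taken
to be equal, and the names `groups`, `model2`, and `pairwise` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Groups in the hypothesized order; A, B, and C are pooled (an assumption).
groups = ["ABC", "D", "E1", "F1", "E2", "F2"]

half = Fraction(1, 2)
fifth = Fraction(1, 5)
quarter = Fraction(1, 4)
zero = Fraction(0)

# Illustrative weights, one per group, for each contrast in the second model.
model2 = {
    "ABCvDE1F1E2F2": [Fraction(1), -fifth, -fifth, -fifth, -fifth, -fifth],
    "DvE1F1E2F2": [zero, Fraction(1), -quarter, -quarter, -quarter, -quarter],
    "EvF": [zero, zero, half, -half, half, -half],
    "1v2": [zero, zero, half, half, -half, -half],
}

# Sum of products for every pair of contrasts; zero means orthogonal.
pairwise = {
    (a, b): sum(x * y for x, y in zip(model2[a], model2[b]))
    for a, b in combinations(model2, 2)
}

for pair, value in pairwise.items():
    print(pair, value)
```

Under these assumed weights every pair sums to zero, so the four contrasts
are mutually orthogonal, unlike the forward difference scheme.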

As it seems that contrast coding questions regularly appear on this mailing
list, I've listed references below that I have found useful when learning
about these issues.   If anyone knows of any more, please add on!

Cohen, Cohen, West, & Aiken (2002). Applied multiple regression/correlation
analysis for the behavioral sciences (Chapters 8 & 9).

Kaufman & Sweet (1974). Contrast coding in least squares regression
analysis. American Educational Research Journal, 11, 359-377.

Serlin & Levin (1985). Teaching how to derive directly interpretable coding
schemes for multiple regression. Journal of Educational Statistics, 10,
223-238.

Wendorf (2004). Primer on multiple regression coding: Common forms and the
additional case of repeated contrasts. Understanding Statistics, 3, 47-57.

~Maureen Gillespie

On Fri, Jun 18, 2010 at 5:01 PM, Peter Graff <graff@mit.edu> wrote:

> Dear R-lang,
>
> I have a question regarding the interpretation of different coding schemes
> in a regression model. My experiment compares 8 different levels of a single
> independent variable. Additionally, it is possible to conceive of 4 of these
> 8 levels as a mini-2x2 within the 8 level variable:
>
> Cells: A, B, C, D, E-1, E-2, F-1, F-2
>
> I hypothesize the following ordering (which is substantiated by the overall
> means in the different conditions):
>
> ABC>D>E1>F1>E2>F2
>
> Additionally, I hypothesize that E>F and 1>2. I have implemented 2
> different models to test these hypotheses and I would like to hear your take
> on what the correct interpretation of the results is. In model 1 I have
> Helmert-coded the hypothesized ordering in terms of the following five
> contrasts:
>
> ABCvDE1F1E2F2
> DvE1F1E2F2
> E1vF1E2F2
> F1vE2F2
> E2vF2
>
> All Helmert-contrasts are significant. In model 2 I have used the following
> contrasts instead:
>
> ABCvDE1F1E2F2
> DvE1F1E2F2
> EvF
> 1v2
>
> All contrasts except EvF are significant. Collinearity is minimal in both
> models (all correlations below |.2|)
>
> Is it fair to say that the ABC>D>E1>F1>E2>F2 has been substantiated by the
> experiment and thus infer that E>F, even though the EvF contrast is not
> significant in a differently coded model?
>
> Thank you very much in advance for your help,
>
> Best,
>
> Peter Graff
>

-- 
Maureen Gillespie, MA

Graduate Student
Northeastern University
Department of Psychology
125 Nightingale Hall
360 Huntington Ave.
Boston, MA 02115
Office: 617-373-3077
Cell: 603-397-7127

http://sites.google.com/site/gillespiemaureen/