[R-lang] Re: Coding

Sat Jun 19 13:16:54 PDT 2010

The coding matrix quoted below is an incorrect contrast specification
for successive differences! The correct specification depends on the
number of levels. The simplest way to get the correct differences is to
use contr.sdif from MASS. In this case:

library(MASS)
contr.sdif(3)

I realize this is very counterintuitive for many people, but see
Venables & Ripley (2002, section 6.2).
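
A minimal sketch of why the two codings differ, assuming the four-level
A-D table quoted below (hence contr.sdif(4) rather than 3). The helper
name `weights` and the object names are only illustrative. Inverting the
matrix formed by an intercept column plus the coding columns gives the
weights that each coefficient places on the cell means:

```r
library(MASS)  # recommended package that ships with R

# Coding matrix from the quoted message (levels A-D); assumed layout
quoted <- cbind(AvB = c(.5, -.5,  0,   0),
                BvC = c( 0,  .5, -.5,  0),
                CvD = c( 0,   0,  .5, -.5))
rownames(quoted) <- c("A", "B", "C", "D")

# Illustrative helper: rows of the inverse of [intercept, coding] are the
# weights each coefficient applies to the cell means
weights <- function(coding) {
  round(solve(cbind(Intercept = 1, coding)), 3)
}

w_quoted <- weights(quoted)         # what the quoted coding estimates
w_sdif   <- weights(contr.sdif(4))  # what contr.sdif estimates

print(w_quoted)
print(w_sdif)

# Cross-products of the coding columns; nonzero off-diagonal entries
# mean the corresponding columns are not orthogonal
print(crossprod(quoted))
print(crossprod(contr.sdif(4)))
```

Rows 2-4 of w_sdif are (-1, 1, 0, 0), (0, -1, 1, 0) and (0, 0, -1, 1),
i.e. B - A, C - B and D - C (the sign is reversed relative to the AvB
labels). For the quoted matrix the AvB row is (1.5, -.5, -.5, -.5), which
is twice the difference between A and the grand mean rather than A - B.
Note that adjacent contr.sdif columns also have nonzero cross-products,
so successive-difference coding is not orthogonal either.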

Reinhold Kliegl

On 19.06.2010, at 21:15, Maureen Gillespie wrote:

> Hey Peter, Steve, and r-lang,
>    Not sure if I'm going to be able to answer your questions, but  
> I'll give it a shot.  I think Steve is right in the sense that  
> forward difference coding would get you the answers to the questions  
> about whether one level is significantly different than the previous
> level (i.e., whether D > E1, rather than whether D > the mean of E1,
> F1, E2, F2, which is what Helmert tests).  However, forward difference
> coding does not produce orthogonal coding variables - i.e., below,
> the dot product of AvB and BvC is nonzero, and the same goes for
> BvC and CvD.  So there will be some degree of collinearity among the
> coding variables if you use this coding scheme, which is an issue
> for getting estimates you can trust.
>
>      AvB    BvC    CvD
> A     .5      0      0
> B    -.5     .5      0
> C     0     -.5     .5
> D     0      0     -.5
>
>
> As for your second question about inferring whether E>F even though  
> your EvF contrast isn't significant... I think that there is a  
> problem doing that because in the first model where you established  
> an order with Helmert coding, you could end up in a situation like  
> Steve alluded to.  In the hypothetical example below, E1vF1E2F2 is
> likely to be significant (.80 vs. .45), as is F1vE2F2 (.78 vs. .29).
> But this asks nothing about whether E > F overall.  In this
> example, it's likely there isn't a difference (.55 vs. .53).
>
> E1 =  .80
> F1 =  .78
> E2 =  .30
> F2 =  .28
>
>
> Without knowing your data, but following the adage that your model
> can only answer what you ask it, I would say that the second model
> better tests your actual hypotheses -- it's the only one that
> actually asks the questions you want to know about EvF and 1v2. It
> also does ask some questions about order: Are ABC different than  
> DE1F1E2F2?  Is D different than the Es and Fs combined?  If the  
> theories/conditions you're testing only need to show those patterns  
> to be supported/refuted, then you should be in good shape. :)
>
>
>
> As it seems that contrast coding questions regularly appear on this  
> mailing list, I've listed references below that I have found useful  
> when learning about these issues.   If anyone knows of any more,  
> please add on!
>
> Cohen, Cohen, West, & Aiken (2002). Applied multiple regression/
> correlation analysis for the behavioral sciences (Chapters 8 & 9).
>
> Kaufman & Sweet (1974). Contrast coding in least squares regression  
> analysis. American Educational Research Journal, 11, 359-377.
>
> Serlin & Levin (1985). Teaching how to derive directly interpretable  
> coding schemes for multiple regression. Journal of Educational  
> Statistics, 10, 223-238.
>
> Wendorf (2004). Primer on multiple regression coding: Common forms  
> and the additional case of repeated contrasts. Understanding  
> Statistics, 3, 47-57.
>
>
> ~Maureen Gillespie
>
>
>
>
>
> On Fri, Jun 18, 2010 at 5:01 PM, Peter Graff <graff@mit.edu> wrote:
> Dear R-lang,
>
> I have a question regarding the interpretation of different coding  
> schemes in a regression model. My experiment compares 8 different  
> levels of a single independent variable. Additionally, it is  
> possible to conceive of 4 of these 8 levels as a mini-2x2 within the  
> 8 level variable:
>
> Cells: A, B, C, D, E-1, E-2, F-1, F-2
>
> I hypothesize the following ordering (which is substantiated by the  
> overall means in the different conditions):
>
> ABC>D>E1>F1>E2>F2
>
> Additionally, I hypothesize that E>F and 1>2. I have implemented 2  
> different models to test these hypotheses and I would like to hear  
> your take on what the correct interpretation of the results is. In  
> model 1 I have Helmert-coded the hypothesized ordering in terms
> of the following five contrasts:
>
> ABCvDE1F1E2F2
> DvE1F1E2F2
> E1vF1E2F2
> F1vE2F2
> E2vF2
>
> All Helmert-contrasts are significant. In model 2 I have used the  
> following contrasts instead:
>
> ABCvDE1F1E2F2
> DvE1F1E2F2
> EvF
> 1v2
>
> All contrasts except EvF are significant. Collinearity is minimal in  
> both models (all correlations below |.2|)
>
> Is it fair to say that the ABC>D>E1>F1>E2>F2 has been substantiated  
> by the experiment and thus infer that E>F, even though the EvF  
> contrast is not significant in a differently coded model?
>
> Thank you very much in advance for your help,
>
> Best,
>
> Peter Graff
>
>
>
> -- 
> Maureen Gillespie, MA
>
> Graduate Student
> Northeastern University
> Department of Psychology
> 125 Nightingale Hall
> 360 Huntington Ave.
> Boston, MA 02115
> Office: 617-373-3077
> Cell: 603-397-7127
>
> http://sites.google.com/site/gillespiemaureen/
>

----
Reinhold Kliegl, Dept. of Psychology, University of Potsdam,
Karl-Liebknecht-Strasse 24-25, 14476 Potsdam, Germany
phone: +49 331 977 2868, fax: +49 331 977 2793
http://www.psych.uni-potsdam.de/people/kliegl/
