[R-lang] Re: 3 Level IV

Reinhold Kliegl kliegl@uni-potsdam.de
Tue Feb 22 13:43:04 PST 2011


I suspect this is an example where the contrast specification does not work the way many people expect. Specifically, you need to assign the (transposed) generalized inverse of your contrast matrix to the 3-level factor, not the contrast matrix itself. Suppose the three condition means are 2, 4, and 12. Then you want the intercept to be the grand mean, 18/3 = 6; the first contrast to be level 3 minus the mean of levels 1 and 2, 12 - 3 = 9; and the second contrast to be level 1 minus the mean of levels 2 and 3, 2 - 8 = -6. Here is a script to illustrate this.

> library(MASS)  # for ginv() and fractions()
> 
> F <- factor(rep(1:3, each=2))
> Y <- c(1, 3, 3, 5, 11, 13) 
> tapply(Y, F, mean)
 1  2  3 
 2  4 12 
> 
> cmat <- matrix(c(-1/2, -1/2,    1,
+                     1, -1/2, -1/2), 3, 2)
> (contrasts(F) <- cmat)
     [,1] [,2]
[1,] -0.5  1.0
[2,] -0.5 -0.5
[3,]  1.0 -0.5
> 
> # Coefficients do not match the expected 6, 9, -6!
> (m1 <- coef(lm(Y ~ F)))
(Intercept)          F1          F2 
   6.000000    5.333333   -1.333333 
> 
> 
> cmat.inv <- fractions(t(ginv(cmat)))
> (contrasts(F) <- cmat.inv)
     [,1] [,2]
[1,]    0  2/3
[2,] -2/3 -2/3
[3,]  2/3    0
> 
> # Now coefficients match expectations!
> (m2 <- coef(lm(Y ~ F)))
(Intercept)          F1          F2 
          6           9          -6 
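
To see which comparison of condition means each coefficient actually estimates, one can invert the coding matrix together with its intercept column. The following is a minimal sketch, not part of the script above: the helper name hyp_weights is made up for illustration, and it assumes a one-factor model with an intercept, so that the coefficients are a fixed linear combination of the three condition means.

```r
library(MASS)  # for ginv() and fractions(), as in the script above

# Hypothetical helper (name is an assumption, not from the script above).
# Row i of the result holds the weights that coefficient i places on the
# condition means: coefficients = solve(cbind(1, coding)) %*% means.
hyp_weights <- function(coding) {
  X <- cbind(1, coding)  # intercept column plus the contrast columns
  solve(X)
}

cmat <- matrix(c(-1/2, -1/2,    1,
                    1, -1/2, -1/2), 3, 2)
cmat.inv <- t(ginv(cmat))

means <- c(2, 4, 12)  # condition means from the example

w1 <- hyp_weights(cmat)      # weights implied by assigning cmat directly
w2 <- hyp_weights(cmat.inv)  # weights implied by the generalized inverse

fractions(w1)
fractions(w2)

drop(w1 %*% means)  # reproduces the coefficients of the first model
drop(w2 %*% means)  # reproduces the coefficients of the second model
```

With cmat assigned directly, the second and third rows of w1 come out as (0, -2/3, 2/3) and (2/3, -2/3, 0), so F1 and F2 are scaled pairwise differences of levels 3 and 1 from level 2. With cmat.inv, the rows of w2 are the intended weights (-1/2, -1/2, 1) and (1, -1/2, -1/2).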


Reinhold Kliegl




On 22.02.2011, at 21:33, Peter Graff wrote:

> Dear Florian, dear Roger,
> 
> Thank you very much for your help.
> 
> The design is basically a regular 2x2 with one cell missing due to linguistic impossibility.
> 
>> Level 1: A C
>> Level 2: A D
>> Level 3: B D
> 
> The question is whether 1=3>2. It turns out that when I code collinearly (AvB, CvD; Coding 1), I get 2 significant effects, which practically cancel each other out.
> 
> CODING 1
> 
> Level 1:	-.25	.5
> Level 2:	-.25	-.25
> Level 3:	.5	-.25
> 
> But when I code orthogonally, I only get one (3>2).
> 
> CODING 2
> 
> Level 1:	-.25	.5
> Level 2:	-.25	-.5
> Level 3:	.5	0
> 
> It is also clear from the data that the result is 1=3>2. It thus seems like the collinearity is producing a spurious significance, even though the collinear coding more straightforwardly mirrors the theoretical motivation.
> 
> Best,
> 
> Peter
> 
> On Feb 19, 2011, at 7:33 PM, T. Florian Jaeger wrote:
> 
>> Hi Peter,
>> 
>> that depends on your question (the hypothesis that you wish to test). Multi-collinearity is a consideration that should come in AFTER you've decided what question to test. 
>> 
>> HTH,
>> Florian
>> 
>> On Sat, Feb 12, 2011 at 8:35 PM, Peter Graff <graff@mit.edu> wrote:
>> Dear R-Lang,
>> 
>> I have a question about how to code a 3-level IV in a regression.
>> 
>> The 3 levels are motivated by 2 2-level predictors of theoretical interest (A/B and C/D):
>> 
>> Level 1: A C
>> Level 2: A D
>> Level 3: B D
>> 
>> Is it reasonable to code:
>> 
>> A versus B, C versus D, Interaction
>> 
>> Or is it better to code:
>> 
>> A v B, AC v AD
>> 
>> The first coding reflects the theoretical interest more directly, the second coding has considerably less collinearity.
>> 
>> Thank you very much,
>> 
>> Peter Graff
>> 
> 

----
Reinhold Kliegl, Dept. of Psychology, University of Potsdam,
Karl-Liebknecht-Strasse 24-25, 14476 Potsdam, Germany
phone: +49 331 977 2868, fax: +49 331 977 2793
http://www.psych.uni-potsdam.de/people/kliegl/index-e.html





