[R-lang] coding in unbalanced data

Fri Sep 3 09:45:36 PDT 2010

Dear list members,

I'm analyzing the data from a self-paced reading experiment using lmer
(lme4). I followed the guidelines at http://hlplab.wordpress.com and in
Baayen 2008, 2010, and my model looks like this:

> lmer(logRTresidual ~ condition.coded * canimacy * corder_of_the_sentence + cSpillOver1 + cSpillOver2 + cSpillOver3 + csex + chand + cage + (1|subj) + (1|word), data=averb)

(c means centered, and I think the names are pretty straightforward.)
I have a fixed effect "condition" with 4 levels: B, D0, D1, D2. My data
were balanced until I removed the RTs smaller than 200 and bigger than
1100. Now I see they are no longer balanced:

> prop.table(table(CNPCC$condition))

        B        D0        D1        D2
0.1496947 0.2280904 0.2889916 0.3332233

I want to use contrast coding (or Helmert coding, or orthogonal coding; are
these the same?). I want to compare B vs. D0, D1 and D2; D0 vs. D1 and D2;
and D1 vs. D2.

When I did the same kind of experiment with judgments (and balanced
data) I used this:

> CNPCC$cond.cod <- CNPCC$condition

> contrasts(CNPCC$cond.cod) <- cbind("B-D0" = c(3, -1, -1, -1), "D0-D1.D2" = c(0, 2, -1, -1), "D1-D2" = c(0, 0, 1, -1))
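
As a quick check of what this coding does, here is a minimal sketch on a standalone matrix rather than on the CNPCC data. The object name `cm` is introduced only for this illustration. It assumes equal weight for each condition, and it compares the matrix with R's built-in `contr.helmert()`, whose rows and columns run in the opposite order.

```r
# Contrast matrix from the message, rebuilt as a standalone object.
# `cm` is a hypothetical name used only for this check.
cm <- cbind("B-D0"     = c(3, -1, -1, -1),
            "D0-D1.D2" = c(0,  2, -1, -1),
            "D1-D2"    = c(0,  0,  1, -1))
rownames(cm) <- c("B", "D0", "D1", "D2")

# Each column sums to zero, so each contrast is centered
# when the four conditions are equally frequent.
col_sums <- colSums(cm)

# Off-diagonal entries of the cross-product are zero,
# so the columns are mutually orthogonal under equal weights.
xp <- crossprod(cm)

# contr.helmert(4) compares each level with the preceding ones;
# reversing its rows and columns gives the same numbers as `cm`,
# which compares each level with the following ones.
same_as_reversed_helmert <- all(cm == contr.helmert(4)[4:1, 3:1])

print(col_sums)
print(xp)
print(same_as_reversed_helmert)
```

So, under equal cell counts, this matrix is both orthogonal and a reversed form of Helmert coding. "Orthogonal" is the broader term, since other sets of mutually orthogonal contrasts exist.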

But now I'm seeing larger correlations between the fixed effects, and I
suspect this may be due to the coding (everything else is centered).
How do I apply this kind of coding to unbalanced data?
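
To make the suspicion concrete, the following sketch weights the contrast columns by the proportions reported above. It assumes those proportions (computed on `CNPCC$condition`) also describe the data passed to the model, which the message does not confirm. The names `p`, `cm`, `w_mean`, `w_cov` and `w_cor` are hypothetical. It only examines the predictors themselves, not the correlation table that lmer prints for the estimates.

```r
# Proportions reported by prop.table() in the message.
p <- c(B = 0.1496947, D0 = 0.2280904, D1 = 0.2889916, D2 = 0.3332233)

# Contrast matrix from the message (rows in the order B, D0, D1, D2).
cm <- cbind("B-D0"     = c(3, -1, -1, -1),
            "D0-D1.D2" = c(0,  2, -1, -1),
            "D1-D2"    = c(0,  0,  1, -1))
rownames(cm) <- names(p)

# Weighted column means: nonzero values mean the contrast
# predictors are no longer centered in the trimmed data.
# (cm * p multiplies row i by p[i] through column-wise recycling.)
w_mean <- colSums(cm * p)

# Weighted covariance and correlation between the contrast columns.
centered <- sweep(cm, 2, w_mean)
w_cov <- crossprod(centered * sqrt(p))
w_cor <- cov2cor(w_cov)

print(w_mean)
print(w_cor)
```

With these unequal proportions the weighted means are nonzero and the off-diagonal correlations are nonzero, so the columns are neither centered nor uncorrelated in the sample, even though the matrix itself is orthogonal.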

Thanks!

Bruno Nicenboim