[R-lang] Re: coding in unbalanced data

Bruno Nicenboim bruno.nicenboim@gmail.com
Sun Sep 12 02:10:22 PDT 2010


Hi, thanks for the advice.
I removed the residuals and used the logRT and the output seems more
reasonable.

When I use contrasts, how do I interpret the sign of the coefficients in the
lmer output?

> contrasts(CNPCC$cond.cod) <- cbind("B-D*" = c(3,-1,-1,-1),
+     "D0-D1.D2" = c(0,2,-1,-1), "D1-D2" = c(0,0,1,-1))

The levels of cond are B, D0, D1, D2. If the coefficient of B-D* is less than
0, does it mean that the reaction time is faster for B? Is the comparison made
between the leftmost condition, the condition with the positive weight, or the
condition with the biggest number, and the rest of the conditions? Or none of
these?
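
A minimal sketch of one way to check the sign, assuming a model with an
intercept and only this factor. The cell means in `mu` are made-up
illustration values, not results from my data. Because the contrast columns
are mutually orthogonal and each sums to zero, every coefficient is a scaled
difference between the mean of the positively weighted level and the mean of
the negatively weighted levels, so a negative B-D* coefficient would mean a
lower (faster) mean for B than the average of D0, D1 and D2:

```r
# Hypothetical cell means on the log RT scale (illustration only)
mu <- c(B = 6.0, D0 = 6.2, D1 = 6.3, D2 = 6.4)

# Contrast matrix as in the message above
C <- cbind("B-D*"     = c(3, -1, -1, -1),
           "D0-D1.D2" = c(0,  2, -1, -1),
           "D1-D2"    = c(0,  0,  1, -1))

# With an intercept, cell means = X %*% beta, hence beta = solve(X, mu)
X <- cbind("(Intercept)" = 1, C)
beta <- as.numeric(solve(X, mu))

# Coefficient for B-D*: (mean of B - mean of the D levels) / 4
b_vs_d <- (mu[["B"]] - mean(mu[c("D0", "D1", "D2")])) / 4

# Coefficient for D0-D1.D2: (mean of D0 - mean of D1 and D2) / 3
d0_vs_rest <- (mu[["D0"]] - mean(mu[c("D1", "D2")])) / 3

# Coefficient for D1-D2: (mean of D1 - mean of D2) / 2
d1_vs_d2 <- (mu[["D1"]] - mu[["D2"]]) / 2

print(rbind(solved = beta[2:4], by_hand = c(b_vs_d, d0_vs_rest, d1_vs_d2)))
```

The divisors 4, 3 and 2 come from the weights chosen here; rescaling a column
changes the size of its coefficient but not its sign.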

Thanks again

On Sat, Sep 4, 2010 at 12:38 AM, Philip Hofmeister <phofmeister@ucsd.edu> wrote:

> Hi Bruno,
>
> First, unbalanced data in LME models will not lead to fixed effect
> correlations, certainly not with the number of data points that
> reading time studies yield. The ability of LME models to handle
> unbalanced data is one of the things that makes them so great.
>
> >
> >> lmer(logRTresidual ~ condition.coded * canimacy * corder_of_the_sentence
> >>      + cSpillOver1 + cSpillOver2 + cSpillOver3 + csex + chand + cage
> >>      + (1|subj) + (1|word), data=averb)
>
> Second, until you understand your data, I wouldn't advise using residual
> log reading times. They can be very useful, but coefficients are hard
> to interpret, and it's generally good practice to start with raw or
> length-residualized reading times.
>
> > (c means centered, and I think the names are pretty straightforward). I
> > have a fixed effect "condition" with 4 levels: B, D0, D1, D2. My data was
> > balanced until I removed the RTs smaller than 200 and bigger than 1100.
> > But now I see it isn't balanced:
> >
> >> prop.table(table(CNPCC$condition))
> >
> >         B        D0        D1        D2
> > 0.1496947 0.2280904 0.2889916 0.3332233
> >
> > I want to use contrast coding (or Helmert coding or orthogonal coding, is
> > it the same?). I want to check B vs D0, D1 and D2; D0 vs D1 and D2; and
> > D1 vs D2.
> >
>
> What you describe is Helmert coding. Did you actually change the
> coding of your condition levels? Once factors have been Helmert coded,
> correlations tend to disappear.
>
> > When I did the same kind of experiment with judgments (and balanced data)
> > I used this:
> >
> >> CNPCC$cond.cod <- CNPCC$condition
> >
> >> contrasts(CNPCC$cond.cod) <- cbind("B-D0" = c(3,-1,-1,-1),
> >>     "D0-D1.D2" = c(0,2,-1,-1), "D1-D2" = c(0,0,1,-1))
> >
> > But now I'm seeing larger correlations of fixed effects and I suspect
> > that it may be because of the coding (everything else is centered). How
> > do I do this kind of coding to unbalanced data?
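
A minimal sketch of how one might check whether the imbalance alone makes the
contrast columns correlate, which is related to (but not the same thing as)
the correlation of fixed effects that lmer prints. The helper name
`contrast_cor` and the per-level counts are assumptions for illustration; the
unbalanced counts only roughly mimic the proportions shown above:

```r
# Same contrast weights as in the quoted code
C <- cbind("B-D*"     = c(3, -1, -1, -1),
           "D0-D1.D2" = c(0,  2, -1, -1),
           "D1-D2"    = c(0,  0,  1, -1))

# Hypothetical helper: correlations among the contrast columns of the
# design matrix, given the number of observations per level
contrast_cor <- function(n) {
  cond <- factor(rep(c("B", "D0", "D1", "D2"), times = n))
  contrasts(cond) <- C
  X <- model.matrix(~ cond)
  cor(X[, -1])  # drop the intercept column
}

balanced   <- contrast_cor(c(250, 250, 250, 250))
unbalanced <- contrast_cor(c(150, 228, 289, 333))  # assumed counts

print(round(balanced, 3))
print(round(unbalanced, 3))
```

In this toy setting the off-diagonal entries are zero for the balanced counts
and nonzero for the unbalanced ones, since the columns are only orthogonal
when every level has the same number of observations.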
>
> You didn't specify which factors are correlated so it's hard to say
> exactly what's going on. You can look here for some general coding
> tips in R (and also just use the R help library):
>
> http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm
>
> I vaguely recall some reference to the above UCLA page in a previous
> posting to this list, so you might wanna search the archives of this
> list, too.
>
> Good luck,
>
> Philip
>
> >
> > Thanks!
> >
> > Bruno Nicenboim
> >
>



-- 
Bruno