[R-lang] Self Paced Reading experiment using residuals with lmer

Bruno Nicenboim bruno.nicenboim@gmail.com
Sun Aug 29 04:26:01 PDT 2010


Hi, 
I'm analyzing the results of a SPR experiment.

I saw that in Jaeger's blog (the HLP/Jaeger lab blog) and in Jaeger, Fedorenko and 
Gibson's article "Anti-locality in English: Consequences for Theories of 
Sentence Comprehension", the results are analyzed with a linear model whose 
dependent variable is the residuals of a first model that looks roughly like 
this (I didn't include the transformations they use):

l <- lmer(RT ~ WordLength + positionofword + positionofstimulus + (1 | SUBJ) + ...)

RTresidual <- residuals(l)

(http://hlplab.wordpress.com/2008/01/23/modeling-self-paced-reading-data-effects-of-word-length-word-position-spill-over-etc/#more-46)

Then, the final linear model looks like this:

l <- lmer(RTresidual ~ CONDITION +
            SPILLOVER_1 + SPILLOVER_2 + SPILLOVER_3 +
            (1 | SUBJ) + (1 | ITEM))

On the other hand, Baayen and Milin in "Analyzing Reaction Times" use a model 
that takes the RT itself (instead of the residuals) as the dependent variable, 
and includes word length and the positions of the word and of the stimulus in 
the same model, roughly like:

l <- lmer(RT ~ CONDITION + WordLength + positionofword + positionofstimulus +
            SPILLOVER_1 + SPILLOVER_2 + SPILLOVER_3 +
            (1 | SUBJ) + (1 | ITEM))
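To make the contrast concrete, here is a minimal base-R sketch of the two approaches on simulated data (my own toy example, using lm() in place of lmer() and made-up variable names). When a covariate such as word length is correlated with the condition of interest, residualizing first can yield a different condition estimate than fitting everything in one model:

```r
set.seed(1)
n <- 500
cond <- rep(0:1, each = n / 2)
# Covariate deliberately correlated with condition:
wordlength <- 5 + 2 * cond + rnorm(n)
# True condition effect on RT is 20 ms:
rt <- 300 + 20 * cond + 10 * wordlength + rnorm(n, sd = 30)

# Two-step approach: residualize RT on the covariate, then
# regress the residuals on condition.
res <- residuals(lm(rt ~ wordlength))
twostep <- coef(lm(res ~ cond))["cond"]

# One-step approach: covariate and condition in the same model.
onestep <- coef(lm(rt ~ cond + wordlength))["cond"]

c(twostep = twostep, onestep = onestep)
```

In this toy setup the two estimates diverge because the first-stage residualization lets the covariate absorb part of the variance that is actually due to the condition, which is one reason the two approaches need not give similar results.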


My questions are:
Is there any advantage or disadvantage that should persuade me to use one 
approach or the other? 
Shouldn't I get similar results? (Because I don't)
And finally, I've noticed that each researcher (not only in these two examples) 
uses different transformations on length, positions and reading times. Is there 
any way to check which transformation is the most appropriate?
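On the transformation question, one common diagnostic (my own suggestion, not something from the papers cited above) is a Box-Cox profile of the response, available as MASS::boxcox() in the MASS package that ships with R. A lambda near 0 favours log(RT), lambda near -1 favours -1/RT, and lambda near 1 suggests leaving RT untransformed:

```r
library(MASS)  # MASS ships with R as a recommended package
set.seed(2)
# Simulated lognormal reading times, as a stand-in for real data:
rt <- exp(rnorm(200, mean = 6, sd = 0.4))
# Profile the log-likelihood over power transformations of rt:
bc <- boxcox(lm(rt ~ 1), plotit = FALSE)
lambda <- bc$x[which.max(bc$y)]
lambda  # for lognormal data this lands near 0, pointing to log(RT)
```

In practice one would fit the first-stage model (with the length and position predictors) rather than an intercept-only lm(), and profile lambda for that model.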

Thanks !




More information about the ling-r-lang-L mailing list