[R-lang] comparisons in lmer

Tue Jan 11 10:17:47 PST 2011

Dear R-lang gurus,
I have questions about coding and interpreting a mixed logistic
regression with lmer. My data are binary (correct vs. incorrect
response), and I have two groups of participants (fluent and
non-fluent) and four types of morphemes (i.e. two orthogonal factors,
so it looks like I shouldn't worry about collinearity). I used this
formula:
mymodel = lmer(response ~ fluency * morphemetype + (1 | subject) + (1 | item),
               data = mydata, family = "binomial")

Again, fluency has two levels, and morpheme type has four. Now, I
want to know whether the two groups of participants perform
differently. This is easy because I have only two groups. But my
concern is that I also want to know the differences between the
morpheme types within each group of participants. How can I get them?
With treatment coding, it looks like the intercept is the performance  
of the reference-level group of participants (here, fluent) on the  
reference-level morpheme type, is that right? Then, with interaction,  
this is compared to the performance of the non-fluent group of  
participants on each of the three remaining morpheme types. And two of  
these three contrasts are significant. I'm not sure how to interpret  
that. Is there an interaction or not? Or is that question even
legitimate? In any case, it makes more sense for me to
compare either the same group of participants on different morpheme
types or the different groups of participants on the same morpheme
types. Without lmer, I would do non-parametric equivalents of t-tests
(and worry about running too many tests and about my significance
levels), but is there a better way?
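
To make the coding question concrete, here is a minimal base-R sketch of the design matrix that treatment coding produces for a 2 x 4 design. The morpheme labels A to D and the level name "nonfluent" are placeholders, not my actual level names, and no model is fitted here; it only shows which cells each coefficient enters.

```r
# Placeholder level names; only the 2 x 4 layout comes from my data.
design <- expand.grid(
  fluency      = factor(c("fluent", "nonfluent")),
  morphemetype = factor(c("A", "B", "C", "D"))
)

# Default treatment coding: the first level of each factor is the reference.
X <- model.matrix(~ fluency * morphemetype, data = design)

# One row per cell of the design, one column per fixed-effect coefficient.
print(cbind(design, X))

# The fluent / A cell (row 1) is described by the intercept alone.
print(X[1, ])

# The nonfluent / B cell (row 4) is intercept + fluency effect
# + morpheme-type-B effect + the fluency-by-B interaction term.
print(X[4, ])
```

Each interaction column is 1 in exactly one non-fluent cell, which suggests that each interaction coefficient is a difference of differences (the effect of that morpheme type for non-fluent participants minus the same effect for fluent participants) rather than a direct comparison of one cell with the intercept cell.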

I thought I might need a different way of coding. I looked at contrast  
coding, and it doesn't seem to help either. I don't quite understand  
how comparisons to the grand mean can be useful for me.
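
For reference, this is a small sketch of the coding I mean, assuming that "contrast coding" corresponds to R's sum-to-zero contrasts (contr.sum). The labels A to D are again placeholders.

```r
# Placeholder labels for the four morpheme types.
morphemetype <- factor(c("A", "B", "C", "D"))

# Replace the default treatment contrasts with sum-to-zero contrasts.
contrasts(morphemetype) <- contr.sum(4)
cm <- contrasts(morphemetype)
print(cm)

# Every contrast column sums to zero, so the intercept becomes the
# unweighted mean over the four levels instead of one reference level.
print(colSums(cm))
```

With this coding, each of the three coefficients is the deviation of one level from that mean, and the last level gets no coefficient of its own, which is why it does not directly give pairwise differences between morpheme types.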

In the past, I had a similar data set, and I solved the problem this  
way: after finding and reporting the difference between the two groups  
of participants, I then analyzed each group's data separately (i.e.  
fluent separately from non-fluent). Then I had only one predictor,  
morpheme type. Treatment coding gives me the difference between one  
morpheme type and each of the remaining three. I ran the same model  
several times (in this case, three times), changing the reference  
level, until I got all the comparisons within each group. But is this
a legitimate thing to do? (In that data set I didn't have any
interaction; I'm not sure whether that's relevant.)
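
A rough sketch of the bookkeeping behind that refitting procedure, using base R only and placeholder labels A to D (no model is fitted; it only counts which pairwise comparisons each reference level yields):

```r
# Placeholder labels for the four morpheme types.
lv <- c("A", "B", "C", "D")

pairs_seen <- character(0)
for (ref in lv[1:3]) {
  # relevel() moves the chosen level to the front, making it the reference.
  f <- relevel(factor(lv), ref = ref)
  others <- levels(f)[-1]
  # With treatment coding, each non-reference level is compared to the reference.
  new_pairs <- sapply(others, function(o) paste(sort(c(ref, o)), collapse = "-"))
  pairs_seen <- c(pairs_seen, unname(new_pairs))
}

unique_pairs <- unique(pairs_seen)
print(length(pairs_seen))    # coefficients collected over the three fits
print(unique_pairs)          # distinct pairwise comparisons
print(length(unique_pairs) == choose(4, 2))
```

Three fits give nine coefficients covering the six distinct pairs, so within each group I end up with six pairwise tests, and some of them are reported twice with opposite signs.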

Thanks,
Marina

-- 
Marina Sherkina-Lieber
Ph.D. candidate
Dept. of Linguistics
University of Toronto