[R-lang] Re: comparisons in lmer

Marina Sherkina-Lieber marina.cherkina@utoronto.ca
Mon Jan 17 08:40:34 PST 2011


Thanks a lot, Maureen! Now that I understand how to interpret the
interaction, I'm already getting a lot more information!
Marina

Quoting Maureen Gillespie <gillespie.maureen@gmail.com>:

> Hi Marina,
>   While I'm by no means a guru, I think I can take a stab at answering some
> of these questions regarding coding.  I've attached a very short example of
> how to interpret main effects and interactions with two treatment-coded
> variables in a fake data set predicting RTs. There is a NoiseCond (NC)
> variable that has 2 levels: silence and noise; and a WordCond variable (WC)
> that has 3 levels: real word (word), phonologically legal nonword (legal),
> and phonologically illegal nonword (illegal). Even though this example uses
> fake data and a linear regression model, the basic ideas should apply. I've
> rounded to the nearest ms in the example, so if differences among
> differences are off by a ms or so, that's why. This is for a 2X3 design, but
> it should give you an idea of what the 2X4 design would look like. You'll
> have two extra lines in the output: (1) one representing your extra morpheme
> level compared to the baseline morpheme level for your baseline level of the
> fluency variable, and (2) the other representing the fluency effect at your
> extra morpheme level compared to the fluency effect at the baseline morpheme
> level.
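>
> (In case the attachment doesn't come through, here's a rough sketch in R of
> the setup I mean; the data are simulated and the effect sizes are made up:)
>
> set.seed(1)
> # simulated 2x3 design: NoiseCond (silence/noise) x WordCond (word/legal/illegal)
> fake <- expand.grid(NC = factor(c("silence", "noise")),
>                     WC = factor(c("word", "legal", "illegal")),
>                     rep = 1:50)
> fake$NC <- relevel(fake$NC, ref = "silence")  # treatment coding, with
> fake$WC <- relevel(fake$WC, ref = "word")     # silence and word as baselines
> fake$RT <- 500 + 30 * (fake$NC == "noise") +                      # noise slows RTs
>            60 * (fake$WC == "legal") + 120 * (fake$WC == "illegal") +
>            40 * (fake$NC == "noise") * (fake$WC == "illegal") +   # interaction
>            rnorm(nrow(fake), sd = 20)
> summary(lm(RT ~ NC * WC, data = fake))
> # (Intercept)        = mean RT in the silence-word cell (both baselines)
> # NCnoise            = noise effect for word items
> # WClegal, WCillegal = each nonword type vs. word, within silence
> # NCnoise:WCillegal  = how much larger the noise effect is for illegal items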
>
> So, to walk through the R output in the PDF (for now, assuming t-values
> above 2 are significant): it tells us that all of the differences
> (and differences among differences) are large enough to be significant,
> except for the difference between the noise-word and silence-word
> conditions. For example, if you wanted to report the last line of the output
> you could say something like "The noise condition effect on RTs is
> significantly larger for illegal items than word items." The sign of the
> coefficient tells you the direction of the effect (e.g., silence-word is
> non-significantly faster than noise-word; noise-legal is significantly
> slower than noise-word). Since you're doing a logistic model, your
> coefficient estimates would be in log odds. Jaeger (2008) gives a nice
> overview of how to convert those estimates into something interpretable (if
> you want to talk about the size of your effects).
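>
> (For instance, just to illustrate the conversion; the coefficient here is
> made up:)
>
> b <- 1.2      # hypothetical coefficient in log odds
> exp(b)        # as an odds ratio: ~3.32
> plogis(b)     # as a probability (inverse logit): ~0.77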
>
> This way of coding doesn't quite get at the simple paired tests you mention
> in your first post, but it's pretty close. It would work very nicely if you
> had a morpheme level where you expected either a very large fluency effect
> or none at all: treat that as your baseline morpheme level, then compare
> the size of the fluency effect at each of the other morpheme levels to that
> baseline.
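>
> (In R you'd set that baseline with relevel(); "neutral" is just a
> placeholder here, since I don't know your actual level names:)
>
> mydata$morphemetype <- relevel(mydata$morphemetype, ref = "neutral")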
>
>
> As for determining if you have an overall interaction (what you'd get from
> an ANOVA output), there are at least two ways I've heard of. As you
> mention, you don't get anything from the lmer() output that tells you this
> directly.
>
> Probably the best way to do this is model comparison: run lmer() models
> with and without the interaction, and then determine whether the model
> including the interaction provides a better fit than the model without it.
> See below:
>
> # full model: main effects plus the fluency-by-morphemetype interaction
> model1 <- lmer(response ~ fluency * morphemetype + (1|subject) + (1|item),
>                data = mydata, family = "binomial")
>
> # reduced model: main effects only
> model2 <- lmer(response ~ fluency + morphemetype + (1|subject) + (1|item),
>                data = mydata, family = "binomial")
>
> # likelihood-ratio test of the two models
> anova(model1, model2)
>
> If the chi-square statistic reported is significant, then model1 (with the
> interaction) provides a significantly better fit than model2 (main effects
> only), which would indicate that the interaction (as a whole) is
> significant. NB: for this comparison it doesn't matter how you've coded the
> variables; the result should always be the same.
>
> A second way I've read about (and I'd be interested in hearing whether
> others have used it) is aovlmer.fnc(), from Baayen's languageR package.
> This gives you F statistics (and p-values) associated with the main effects
> and the interaction, and the output looks something like a normal ANOVA
> table. I don't know how reliable this is, though: it doesn't aggregate over
> subjects and items as you normally would in a repeated-measures ANOVA, so
> the dfs are much larger than in a by-subjects or by-items ANOVA, and I'm
> guessing this overestimates significance. Again, these values won't differ
> no matter how you code your variables.
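>
> (Usage is something like this, if I remember the languageR interface
> right; check ?aovlmer.fnc for the exact arguments:)
>
> library(languageR)
> aovlmer.fnc(model1)  # ANOVA-style table of F statistics for the fixed effects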
>
> Marina brings up an interesting question I've wondered about, too.  What if
> you DO want to do a bunch of paired tests and your coding scheme isn't going
> to get at them all in one go? What is the best way of doing this? Run
> multiple models on subsets of the data, then adjust/correct for multiple
> comparisons?
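>
> (Here's a rough sketch of what I have in mind; the reference level name and
> the p-values below are made up:)
>
> # refit with a different baseline to get the remaining pairwise contrasts
> mydata$morphemetype <- relevel(mydata$morphemetype, ref = "typeB")
> model1b <- lmer(response ~ fluency * morphemetype + (1|subject) + (1|item),
>                 data = mydata, family = "binomial")
>
> # ...then correct the collected p-values for the number of tests
> pvals <- c(0.021, 0.004, 0.048)  # made-up p-values from the refits
> p.adjust(pvals, method = "bonferroni")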
>
> ~Maureen
>
> On Tue, Jan 11, 2011 at 1:17 PM, Marina Sherkina-Lieber <
> marina.cherkina@utoronto.ca> wrote:
>
>> Dear R-lang gurus,
>> I have questions about coding and interpreting a mixed logistic regression
>> with lmer. My data is binary (correct vs. incorrect response), and I have
>> two groups of participants (fluent and non-fluent) and four types of
>> morphemes (i.e. two orthogonal factors, so it looks like I shouldn't worry
>> about collinearity). I used this formula:
>>
>> mymodel <- lmer(response ~ fluency * morphemetype + (1|subject) + (1|item),
>>                 data = mydata, family = "binomial")
>>
>> Again, fluency has two levels, and morpheme type, four levels. Now, I want
>> to know whether the two groups of participants perform differently. This is
>> easy because I have only two groups. But my concern is, I also want to know
>> the differences between the different morpheme types within each group of
>> participants. How can I get them? With treatment coding, it looks like the
>> intercept is the performance of the reference-level group of participants
>> (here, fluent) on the reference-level morpheme type, is that right? Then,
>> with the interaction, this is compared to the performance of the non-fluent
>> group of participants on each of the three remaining morpheme types. Two
>> of these three contrasts are significant, and I'm not sure how to interpret
>> that. Is there an interaction or not? Or is that question even legitimate
>> here? The problem is, it makes more sense to me to compare either the
>> same group of participants on different morpheme types or the different
>> groups of participants on the same morpheme types. Without lmer, I would do
>> non-parametric equivalents of t-tests (and worry about doing too many tests
>> and my significance levels), but is there a better way?
>>
>> I thought I might need a different way of coding. I looked at contrast
>> coding, and it doesn't seem to help either. I don't quite understand how
>> comparisons to the grand mean can be useful for me.
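>>
>> (The grand-mean comparisons I mean come from sum coding, i.e. something
>> like:
>>
>> contrasts(mydata$morphemetype) <- contr.sum(4)
>>
>> where each coefficient then compares one morpheme level to the grand
>> mean.)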
>>
>> In the past, I had a similar data set, and I solved the problem this way:
>> after finding and reporting the difference between the two groups of
>> participants, I then analyzed each group's data separately (i.e. fluent
>> separately from non-fluent). Then I had only one predictor, morpheme type.
>> Treatment coding gives me the difference between one morpheme type and each
>> of the remaining three. I ran the same model several times (in this case,
>> three times), changing the reference level, until I got all the comparisons
>> within each group. But is this a legitimate thing to do? (In that data set,
>> I didn't have any interaction; I'm not sure if that's relevant.)
>>
>> Thanks,
>> Marina
>>
>> --
>> Marina Sherkina-Lieber
>> Ph.D. candidate
>> Dept. of Linguistics
>> University of Toronto
>>
>>
>>
>
>
> --
> Maureen Gillespie, MA
>
> Graduate Student
> Northeastern University
> Department of Psychology
> 125 Nightingale Hall
> 360 Huntington Ave.
> Boston, MA 02115
> Office: 617-373-3798
> Cell: 603-397-7127
>
> http://sites.google.com/site/gillespiemaureen/
>



-- 
Marina Sherkina-Lieber
Ph.D. candidate
Dept. of Linguistics
University of Toronto



