[R-lang] Re: comparisons in lmer

Maureen Gillespie gillespie.maureen@gmail.com
Wed Jan 12 09:06:48 PST 2011


Hi Marina,
  While I'm by no means a guru, I think I can take a stab at answering some
of these questions regarding coding.  I've attached a very short example of
how to interpret main effects and interactions with two treatment-coded
variables in a fake data set predicting RTs. There is a NoiseCond (NC)
variable that has 2 levels: silence and noise; and a WordCond variable (WC)
that has 3 levels: real word (word), phonologically legal nonword (legal),
and phonologically illegal nonword (illegal). Even though this example uses
fake data and a linear regression model, the basic ideas should apply. I've
rounded to the nearest ms in the example, so if differences among
differences are off by a ms or so, that's why. This is for a 2X3 design, but
it should give you an idea of what the 2X4 design would look like. You'll
have two extra lines in the output: (1) one representing your extra morpheme
level compared to the baseline morpheme level for your baseline level of the
fluency variable, and (2) the other representing the fluency effect at your
extra morpheme level compared to the fluency effect at the baseline morpheme
level.
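The same logic can be sketched in a few lines of base R. The design, level names, and cell means below are made up for illustration (one fake mean RT per cell, so the saturated model recovers them exactly):

```r
# Fake 2x3 design: NoiseCond (silence/noise) x WordCond (word/legal/illegal),
# with treatment (dummy) coding, which is R's default for factors.
nc <- factor(rep(c("silence", "noise"), each = 3),
             levels = c("silence", "noise"))
wc <- factor(rep(c("word", "legal", "illegal"), times = 2),
             levels = c("word", "legal", "illegal"))
rt <- c(500, 550, 600, 510, 590, 700)  # made-up mean RT per cell (ms)

m <- lm(rt ~ nc * wc)
coef(m)
# (Intercept)       = 500: the baseline cell (silence-word)
# ncnoise           =  10: noise vs. silence, at the baseline word level
# wclegal           =  50: legal vs. word, in silence
# wcillegal         = 100: illegal vs. word, in silence
# ncnoise:wclegal   =  30: noise effect for legal minus noise effect for word
# ncnoise:wcillegal =  90: noise effect for illegal minus noise effect for word
```

Each main-effect term is a difference from the baseline cell, and each interaction term is a difference among differences, exactly as in the pdf.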

So, to walk through the R output in the pdf (assuming for now that t-values
above 2 are significant): this tells us that all of the differences
(and differences among differences) are large enough to be significant,
except for the difference between the noise-word and silence-word
conditions. For example, if you wanted to report the last line of the output
you could say something like "The noise condition effect on RTs is
significantly larger for illegal items than word items." The sign of the
coefficient tells you the direction of the effect (e.g., silence-word is
non-significantly faster than noise-word; noise-legal is significantly
slower than noise-word). Since you're doing a logistic model, your
coefficient estimates would be in log odds. Jaeger (2008) gives a nice
overview of how to convert those estimates into something interpretable (if
you want to talk about the size of your effects).
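If it helps, the conversions Jaeger (2008) discusses can be done with base R functions (the coefficient values below are made up):

```r
# A logistic model's coefficients are in log odds. Base R converts them:
plogis(1.5)  # log odds -> probability, ~0.82
exp(1.5)     # log odds -> odds ratio, ~4.48
plogis(0)    # a log-odds of 0 corresponds to a probability of 0.5
```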

This way of coding doesn't quite get at the simple paired tests you mention
in your first post, but it's pretty close. It would work very nicely if
there were a morpheme level at which you expected either a very large
fluency effect or none at all: treat that level as your baseline, then
compare the size of the fluency effect at each of the other morpheme levels
against it.
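Choosing that baseline is just a matter of reordering the factor with relevel(). A minimal sketch, with made-up morpheme-level names:

```r
# Hypothetical morpheme-type factor; the level names are invented.
morphemetype <- factor(c("agreement", "tense", "case", "aspect"))
# Alphabetical order makes "agreement" the default baseline; relevel() lets
# you pick the level all other comparisons should be made against.
morphemetype <- relevel(morphemetype, ref = "tense")
levels(morphemetype)[1]  # "tense" is now the treatment-coding baseline
```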


As for determining if you have an overall interaction (what you'd get from
an ANOVA output), there are at least two ways I've heard of - as you
mention, nothing in the lmer() output itself tells you this.

Probably the best way to do this is model comparison: run lmer() models
with and without the interaction, then test whether the model including the
interaction provides a better fit than the model without it. See below:

model1 <- lmer(response ~ fluency * morphemetype + (1 | subject) + (1 | item),
               data = mydata, family = "binomial")

model2 <- lmer(response ~ fluency + morphemetype + (1 | subject) + (1 | item),
               data = mydata, family = "binomial")

anova(model1, model2)

If the chi-squared statistic reported is significant, then model1 (with the
interaction) provides a significantly better fit than model2 (main effects
only), which indicates that the interaction as a whole is significant. NB:
for this comparison it doesn't matter how you've coded the variables - the
result will always be the same.
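The same comparison logic can be sketched with ordinary logistic regression (glm), so it runs without lme4; the data here are simulated and the variable names echo Marina's model:

```r
# Simulated 2x4 design: fluency (2 levels) x morphemetype (4 levels),
# binary responses. Probabilities are made up for illustration.
set.seed(1)
d <- data.frame(
  fluency      = factor(rep(c("fluent", "nonfluent"), each = 200)),
  morphemetype = factor(rep(c("m1", "m2", "m3", "m4"), times = 100))
)
d$response <- rbinom(400, 1, ifelse(d$fluency == "fluent", 0.8, 0.5))

full    <- glm(response ~ fluency * morphemetype, data = d, family = binomial)
reduced <- glm(response ~ fluency + morphemetype, data = d, family = binomial)

# Likelihood-ratio test of the interaction as a whole (3 df for a 2x4 design)
anova(full, reduced, test = "Chisq")
```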

The second way I've read about (and I'd be interested to hear whether
others have used it) is aovlmer.fnc() from the languageR package. It gives
you F statistics (and p-values) for the main effects and the interaction,
so the output looks something like a normal ANOVA table. I don't know how
reliable it is, though: it doesn't aggregate over subjects and items as you
would in a repeated-measures ANOVA, so the dfs are much larger than in a
by-subjects or by-items ANOVA, and I'd guess this overestimates
significance. Again, these values won't differ no matter how you code your
variables.

Marina brings up an interesting question I've wondered about, too. What if
you DO want to run a set of paired tests and no single coding scheme will
get at them all in one go? What is the best way of doing this - run
multiple models on subsets of the data, then adjust/correct for multiple
comparisons?
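If one went that route, base R's p.adjust() handles the correction step. The p-values below are hypothetical:

```r
# Hypothetical p-values collected from four separate paired comparisons.
p <- c(0.01, 0.04, 0.03, 0.20)
p.adjust(p, method = "holm")        # Holm's sequential correction
p.adjust(p, method = "bonferroni")  # the more conservative Bonferroni
```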

~Maureen

On Tue, Jan 11, 2011 at 1:17 PM, Marina Sherkina-Lieber <
marina.cherkina@utoronto.ca> wrote:

> Dear R-lang gurus,
> I have questions about coding and interpreting a mixed logistic regression
> with lmer. My data is binary (correct vs. incorrect response), and I have
> two groups of participants - fluent and non-fluent - and four types of
> morphemes (i.e. two orthogonal factors, so looks like I shouldn't worry
> about collinearity). I used this formula:
> mymodel=lmer(response~fluency*morphemetype+(1|subject)+(1|item),
> data=mydata, family="binomial")
>
> Again, fluency has two levels, and morpheme type, four levels. Now, I want
> to know whether the two groups of participants perform differently. This is
> easy because I have only two groups. But my concern is, I also want to know
> the differences between the different morpheme types within each group of
> participants. How can I get them? With treatment coding, it looks like the
> intercept is the performance of the reference-level group of participants
> (here, fluent) on the reference-level morpheme type, is that right? Then,
> with interaction, this is compared to the performance of the non-fluent
> group of participants on each of the three remaining morpheme types. And two
> of these three contrasts are significant. I'm not sure how to interpret
> that. Is there an interaction or not? Or is the latter question legitimate
> at all? But the problem is, it makes more sense for me to compare either the
> same group of participants on different morpheme types or the different
> groups of participants on the same morpheme types. Without lmer, I would do
> non-parametric equivalents of t-tests (and worry about doing too many tests
> and my significance levels), but is there a better way?
>
> I thought I might need a different way of coding. I looked at contrast
> coding, and it doesn't seem to help either. I don't quite understand how
> comparisons to the grand mean can be useful for me.
>
> In the past, I had a similar data set, and I solved the problem this way:
> after finding and reporting the difference between the two groups of
> participants, I then analyzed each group's data separately (i.e. fluent
> separately from non-fluent). Then I had only one predictor, morpheme type.
> Treatment coding gives me the difference between one morpheme type and each
> of the remaining three. I ran the same model several times (in this case,
> three times), changing the reference level, until I got all the comparisons
> within each group. But is this a legitimate thing to do? (In that data, I
> didn't have any interaction, not sure if it's relevant)
>
> Thanks,
> Marina
>
> --
> Marina Sherkina-Lieber
> Ph.D. candidate
> Dept. of Linguistics
> University of Toronto
>
>
>


-- 
Maureen Gillespie, MA

Graduate Student
Northeastern University
Department of Psychology
125 Nightingale Hall
360 Huntington Ave.
Boston, MA 02115
Office: 617-373-3798
Cell: 603-397-7127

http://sites.google.com/site/gillespiemaureen/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: TreatmentCoding example.pdf
Type: application/pdf
Size: 139898 bytes
Desc: not available
Url : http://mailman.ucsd.edu/pipermail/ling-r-lang-l/attachments/20110112/8169e816/attachment-0001.pdf 
