[R-lang] Re: Random slopes in LME
Zhenguang Cai
s0782345@sms.ed.ac.uk
Sun Feb 20 12:57:55 PST 2011
Hi Florian,
This helps a lot. There seems to be a lack of consensus as to how to
conduct LME analyses. I have seen published papers that used
intercept-only models exclusively.
I have a minor question: for a 3-level predictor, can we determine
whether it has a significant main effect and also calculate the pairwise
comparisons among levels? I used a very clumsy approach. First, I
applied centering to the fixed predictor so that the summary of the
best-fit model gives the estimate, SE, z/t, and p-value. To calculate
the pairwise comparisons, I then built the same model with uncentered
fixed predictors and ran it through multcomp. I suppose there is a
single-step way?
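
Perhaps something like this would do both in one go (just a sketch;
lme4 and multcomp are assumed to be installed, and d, RT, and Pred are
hypothetical names for the data frame, the dependent variable, and the
3-level predictor):

library(lme4)
library(multcomp)

## Omnibus test of the 3-level predictor, via comparison of ML fits:
fit.0 <- lmer(RT ~ 1 + (1 | Subject) + (1 | Item), data = d, REML = FALSE)
fit.1 <- lmer(RT ~ Pred + (1 | Subject) + (1 | Item), data = d, REML = FALSE)
anova(fit.0, fit.1)

## All pairwise comparisons among the levels of Pred, from this one
## model, with multiplicity-adjusted p-values:
summary(glht(fit.1, linfct = mcp(Pred = "Tukey")))
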
Garry
On 2011-2-20 20:28, T. Florian Jaeger wrote:
> Hi R-lang,
>
> maybe this is a good time to comment on a more general issue that Roger
> also brought up. As more and more people employ mixed models to analyze
> their data, we will need some conventions in terms of the procedures
> employed and the aspects of the analysis that are reported. While that
> is probably unattainable for the most general case, I think it's a
> reasonably achievable goal for standard 2 x 2 (or, by extensions,
> factorial) psycholinguistic experiments with more or less balanced data.
>
> In the meantime, I have some comments on the specific issue of random
> slopes. For a factorial design, I would submit that it should be the
> norm that we test the full factorial random effect structure with cross
> subject and item random effects. The goal of this procedure should be
> what I have called the "maximal random effect structure justified by the
> data" (which, of course, is a bit of a shorthand, since it's really the
> maximal random effect structure justified by the data under a set of
> assumptions, such as that the assumptions of the generalized linear
> mixed model and the particular family of distributions are met to a
> sufficient extent, that there is sufficient data to justify model
> comparison using a chisq over differences in deviances, that there is no
> overfitting issue, etc.).
>
> To see why we should test whether random slopes are required (and
> include them in the model, if they are required), it might be helpful to
> recall what random effect structure in a linear mixed model would most
> closely correspond to our good old friends, the F1 and F2 (by-subject
> and by-item) ANOVA. For example, the F1 ANOVA is intended to not just
> capture between-subject differences in terms of the intercept (e.g.
> overall reading time differences, if we assume sum-coding). It's also
> meant to account for between-subject differences in the sensitivity to
> /all/ of the design conditions (e.g. the two main effects and the
> interaction of a 2x2 design). So, the linear mixed model that most
> closely corresponds to the F1 ANOVA has the following random effect
> structure:
>
> (1 + Condition1 + Condition2 + Condition1:Condition2 | Participant)
>
> or shorter
>
> (1 + Condition1*Condition2 | Participant)
>
> and, similarly, the closest thing to an F2 analysis would be
>
> (1 + Condition1*Condition2 | Item)
>
> hence, to take maximal advantage of the fact that linear mixed models
> allow us to combine both analyses into one (reducing, among other
> things, the conservativity of minF analyses), the starting model should
> probably have both of these as crossed random effects, plus, obviously,
> the fixed effects for the design:
>
> 1 + Condition1*Condition2 + (1 + Condition1*Condition2 | Participant)
> + (1 + Condition1*Condition2 | Item)
>
> I'll call this the full model (not the same as the model with "the
> maximal random effect structure justified by the data"!). Now, not all
> of the random effects may be justified, as we can test by model
> comparison. For example, the model might not even converge. So, we can
> simplify the model stepwise, starting with the higher order random
> effect terms (the highest interaction). I am assuming that we will leave
> the fixed effect structure untouched since it's directly justified by
> the theoretical interests that led us to create the 2x2 design.
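>
> In lme4 syntax, fitting the full model might look like this (a sketch;
> d, RT, Condition1, Condition2, Participant, and Item stand in for your
> data frame and its columns):
>
> library(lme4)
> m.full <- lmer(RT ~ 1 + Condition1*Condition2 +
>                  (1 + Condition1*Condition2 | Participant) +
>                  (1 + Condition1*Condition2 | Item),
>                data = d, REML = FALSE)  # ML fit, for deviance comparisons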
>
> As we simplify the random effect structure stepwise, there are some tips
> that I find useful (see also
> http://hlplab.wordpress.com/2009/05/14/random-effect-structure/ for a
> simple recipe):
>
> 1) Let's start by comparing the full model with a model with just the
> random intercepts:
>
> 1 + Condition1*Condition2 + (1 | Participant) + (1 | Item)
>
> Let's call this the intercept-only model. I'll assume that we have
> enough data to justify model comparison using a chisq over differences
> in deviances (that will usually be the case for a reasonably sized
> psycholinguistic experiment). I'll also assume ML-fitted models. If the
> difference in deviance between the full model and intercept-only model
> is small, i.e. if the chisq of the model comparison has a value of less
> than 3 (or 2, if you want to be conservative; if I recall correctly,
> even a chisq(1) < 3.84 is technically not significant), we don't really
> need to look further. There is no way then that any of the intermediate
> models could be significant. That's because differences in deviance
> between nested models are additive: if model A is nested in B and B is
> nested in C (and hence A is nested in C), then the difference in
> deviance between A and B plus the difference between B and C equals the
> difference between A and C. Since each of those differences is
> non-negative for ML fits, neither part can exceed the total.
>
> Even if the chisq between the full and the intercept-only model is
> larger, you'll immediately get an idea of whether a lot or just a
> little of the model improvement will be achieved by adding random
> slopes.
> So, even if you need to continue your comparisons to find the best
> model, you've learned something valuable at that moment.
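>
> In code, this first comparison might be (a sketch, reusing m.full and
> the hypothetical names from above):
>
> m.int <- lmer(RT ~ 1 + Condition1*Condition2 +
>                 (1 | Participant) + (1 | Item),
>               data = d, REML = FALSE)
> anova(m.int, m.full)  # chisq on the difference in deviance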
>
> 2) In our typical psycholinguistic experiment, participants are exposed
> to a barrage of lexically and structurally rather homogeneous stimuli.
> Researchers and paradigms, of course, differ in how a typical set of
> stimuli looks, but if you live in the "The plumber that the janitor knew
> is from New York." stimulus world and you 'successfully' created a
> stimulus set that seeks its match in terms of homogeneity (a match it
> most certainly won't find in actual language use ;), then, in my
> experience, you can expect your item random effects to be rather small.
> I don't necessarily think that's a great thing, but it's convenient with
> regard to the statistical issues at hand. If you work with such stimuli,
> I recommend starting by comparing the full model to a model with only
> random subject effects:
>
> 1 + Condition1*Condition2 + (1 + Condition1*Condition2 | Participant)
>
> Again, if the chisq is really small, you don't need to look into item
> effects any further.
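>
> In code (again a sketch with the same hypothetical names):
>
> m.subj <- lmer(RT ~ 1 + Condition1*Condition2 +
>                  (1 + Condition1*Condition2 | Participant),
>                data = d, REML = FALSE)
> anova(m.subj, m.full)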
>
>
> 3) Hal Tily has developed a function (which he posted a while ago to
> this list) that automatically performs stepwise model comparison. While
> I am *not* advocating to blindly rely on automatic model simplification,
> I think this procedure -if restricted to random effects- could be
> acceptable for balanced designs. I think Roger Levy and Hal are working
> on a new and improved implementation, which hopefully will soon be
> available to this list (yippie!).
>
> Ok, sorry for the unsolicited, long, and perhaps trivial (?) email.
> Also, I am sure I've screwed up somewhere, so feel free to correct or
> expand. I just had the feeling that the general procedure outlined above
> wasn't clear to some (many?) readers of this list.
>
> Florian
>
>
>
> On Sun, Feb 20, 2011 at 8:08 AM, Zhenguang Garry Cai <z.cai@ed.ac.uk> wrote:
>
> Hi Ariel,
>
> Sorry for the somewhat misleading information. I do not recall any
> formal requirement from JML for random slopes, but in a recent
> submission to JML, I was required to include random slopes. Also, as
> Roger Levy said, including random slopes is a sensible thing to do.
>
>
> Garry
>
>
> On 20/02/2011 12:37, Ariel M. Goldberg wrote:
>
> Garry,
>
> Does JML have a set of specific requirements for mixed-effects
> analyses? I wasn't able to find anything in their author
> instructions.
>
> Thanks,
> AG
>
> On Feb 13, 2011, at 1:17 PM, Zhenguang Cai wrote:
>
> Hi all,
>
> I have a question concerning random slopes in mixed effects
> modeling. So I ran a structural priming experiment with a
> 4-level variable (prime type: A, B, C and D). The dependent
> variable is response construction (DO dative vs. PO dative).
> The following is a summary of the experiment results.
>
> Prime        A      B      C      D
> DOs         85     24     38     59
> POs         82    144    128    109
> % of DOs   .51    .14    .23    .35
>
> I am interested in whether different prime types induced
> different priming, e.g., whether A led to more DO responses
> than B, C or D. Initially, I ran LME analyses with random
> intercepts only. For instance, I did the following to see
> whether there was a main effect of prime type.
>
> fit.0 = lmer(Response ~ 1 + (1|Subject) + (1|Item), family=binomial)
> fit.p = lmer(Response ~ Prime + (1|Subject) + (1|Item),
>              family=binomial)
> anova(fit.0, fit.p)
>
> Then, I did pairwise comparison by changing the reference
> level for Prime, e.g.,
>
> fit.p = lmer(Response ~ relevel(Prime, ref="B") + (1|Subject) +
>              (1|Item), family=binomial)
>
> It seems that all the levels differed from each other. In
> particular, the comparison between C and D resulted in Estimate =
> -1.02, SE = .32, z = -3.21, p < .01.
>
> But it seems I have to consider whether the slope for Prime
> differs across subjects or items (at least this is a
> requirement from JML). So the following is the way I
> considered whether a random slope should be included in a
> model; I wonder whether I did the correct thing. I first
> determined whether a subject random slope should be included
> by comparing the following two models.
>
> fit.p  = lmer(Response ~ Prime + (1|Subject) + (1|Item),
>               family=binomial)
> fit.ps = lmer(Response ~ Prime + (Prime+1|Subject) + (1|Item),
>               family=binomial)
> anova(fit.p, fit.ps)
>
> I did the same thing for the item random slopes.
>
> fit.p  = lmer(Response ~ Prime + (1|Subject) + (1|Item),
>               family=binomial)
> fit.pi = lmer(Response ~ Prime + (1|Subject) + (Prime+1|Item),
>               family=binomial)
> anova(fit.p, fit.pi)
>
> The subject random slope had a significant effect, so I
> included it in the final model (e.g., fit.ps). But the
> pairwise comparisons returned something different from my
> initial analyses (when the random slope was not included).
> That is, the comparison between C
> and D became only marginally significant (Estimate = -.85,
> SE = .47, z = -1.79, p = .07). It is a bit strange because
> the 9% difference between B and C turned out to be
> significant, but the 12% difference between C and D was not.
>
> Or did I do anything wrong in the analyses?
>
> Thanks,
>
> Garry