[R-lang] Re: Random slopes in LME
Zhenguang Cai
s0782345@sms.ed.ac.uk
Sun Feb 20 12:57:55 PST 2011
Hi Florian,
This helps a lot. There seems to be a lack of consensus as to how to
conduct LME analyses. I have seen published papers that used
intercept-only models exclusively.
I have a minor question: for a 3-level predictor, can we determine
whether it has a significant main effect and also calculate the pairwise
comparisons among levels? I used a very clumsy approach. First, I
applied centering to the fixed predictor so that the summary of the
best-fit model gives the estimate, SE, z/t, and p-value. To calculate
the pairwise comparisons, I then built the same model with uncentered
fixed predictors and ran it through multcomp. I suppose there is a
single-step way?
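
Perhaps something like this would do both in one go (just a sketch;
lme4 and multcomp are assumed to be installed, and d, RT, and Pred are
hypothetical names for the data frame, the dependent variable, and the
3-level predictor):

library(lme4)
library(multcomp)

## Omnibus test of the 3-level predictor, via comparison of ML fits:
fit.0 <- lmer(RT ~ 1 + (1 | Subject) + (1 | Item), data = d, REML = FALSE)
fit.1 <- lmer(RT ~ Pred + (1 | Subject) + (1 | Item), data = d, REML = FALSE)
anova(fit.0, fit.1)

## All pairwise comparisons among the levels of Pred, from this one
## model, with multiplicity-adjusted p-values:
summary(glht(fit.1, linfct = mcp(Pred = "Tukey")))
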
Garry
On 2011-2-20 20:28, T. Florian Jaeger wrote:
> Hi R-lang,
>
> maybe this is a good time to comment on a more general issue that Roger
> also brought up. As more and more people employ mixed models to analyze
> their data, we will need some conventions in terms of the procedures
> employed and the aspects of the analysis that are reported. While that
> is probably unattainable for the most general case, I think it's a
> reasonably achievable goal for standard 2 x 2 (or, by extensions,
> factorial) psycholinguistic experiments with more or less balanced data.
>
> In the meantime, I have some comments on the specific issue of random
> slopes. For a factorial design, I would submit that it should be the
> norm that we test the full factorial random effect structure with cross
> subject and item random effects. The goal of this procedure should be
> what I have called the "maximal random effect structure justified by the
> data" (which, of course, is a bit of a shorthand, since it's really the
> maximal random effect structure justified by the data under a set of
> assumptions, such as that the assumptions of the generalized linear
> mixed model and the particular family of distributions are met to a
> sufficient extent, that there is sufficient data to justify model
> comparison using a chisq over differences in deviances, that there is no
> overfitting issue, etc.).
>
> To see why we should test whether random slopes are required (and
> include them in the model, if they are required), it might be helpful to
> recall what random effect structure in a linear mixed model would most
> closely correspond to our good old friends, the F1 and F2 (by-subject
> and by-item) ANOVA. For example, the F1 ANOVA is intended to not just
> capture between-subject differences in terms of the intercept (e.g.
> overall reading time differences, if we assume sum-coding). It's also
> meant to account for between-subject differences in the sensitivity to
> /all/ of the design conditions (e.g. the two main effects and the
> interaction of a 2x2 design). So, the linear mixed model that most
> closely corresponds to the F1 ANOVA has the following random effect
> structure:
>
> (1 + Condition1 + Condition2 + Condition1:Condition2 | Participant)
>
> or shorter
>
> (1 + Condition1*Condition2 | Participant)
>
> and, similarly, the closest thing to an F2 analysis would be
>
> (1 + Condition1*Condition2 | Item)
>
> hence, to take maximal advantage of the fact that linear mixed models
> allow us to combine both analyses into one (reducing, among other
> things, the conservativity of minF analyses), the starting model should
> probably have both of these as crossed random effects, plus, obviously,
> the fixed effects for the design:
>
> 1 + Condition1*Condition2 + (1 + Condition1*Condition2 | Participant)
> + (1 + Condition1*Condition2 | Item)
>
> I'll call this the full model (not the same as the model with "the
> maximal random effect structure justified by the data"!). Now, not all
> of the random effects may be justified, as we can test by model
> comparison. For example, the model might not even converge. So, we can
> simplify the model stepwise, starting with the higher order random
> effect terms (the highest interaction). I am assuming that we will leave
> the fixed effect structure untouched since it's directly justified by
> the theoretical interests that led us to create the 2x2 design.
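>
> In lme4 syntax, fitting the full model might look like this (a sketch;
> d, RT, Condition1, Condition2, Participant, and Item stand in for your
> data frame and its columns):
>
> library(lme4)
> m.full <- lmer(RT ~ 1 + Condition1*Condition2 +
>                  (1 + Condition1*Condition2 | Participant) +
>                  (1 + Condition1*Condition2 | Item),
>                data = d, REML = FALSE)  # ML fit, for deviance comparisons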
>
> As we simplify the random effect structure stepwise, there are some tips
> that I find useful (see also
> http://hlplab.wordpress.com/2009/05/14/random-effect-structure/ for a
> simple recipe):
>
> 1) Let's start by comparing the full model with a model with just the
> random intercepts:
>
> 1 + Condition1*Condition2 + (1 | Participant) + (1 | Item)
>
> Let's call this the intercept-only model. I'll assume that we have
> enough data to justify model comparison using a chisq over differences
> in deviances (that will usually be the case for a reasonably sized
> psycholinguistic experiment). I'll also assume ML-fitted models. If the
> difference in deviance between the full model and intercept-only model
> is small, i.e. if the chisq of the model comparison has a value of less
> than 3 (or 2, if you want to be conservative; if I recall correctly,
> even a chisq(1) < 3.84 is technically not significant), we don't really
> need to look further. There is no way then that any of the intermediate
> models could be significant. That's because differences in deviance
> between nested models are additive: if model A is nested in B and B is
> nested in C (and hence A is nested in C), then the difference in
> deviance between A and B plus the difference between B and C equals the
> difference between A and C. Since each of those differences is
> non-negative for ML fits, neither part can exceed the total.
>
> Even if the chisq between the full and the intercept-only model is
> larger, you'll immediately get an idea of whether a lot or just a
> little of the model improvement will be achieved by adding random
> slopes.
> So, even if you need to continue your comparisons to find the best
> model, you've learned something valuable at that moment.
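>
> In code, this first comparison might be (a sketch, reusing m.full and
> the hypothetical names from above):
>
> m.int <- lmer(RT ~ 1 + Condition1*Condition2 +
>                 (1 | Participant) + (1 | Item),
>               data = d, REML = FALSE)
> anova(m.int, m.full)  # chisq on the difference in deviance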
>
> 2) In our typical psycholinguistic experiment, participants are exposed
> to a barrage of lexically and structurally rather homogeneous stimuli.
> Researchers and paradigms, of course, differ in how a typical set of
> stimuli looks, but if you live in the "The plumber that the janitor knew
> is from New York." stimulus world and you 'successfully' created a
> stimulus set that seeks its match in terms of homogeneity (a match it
> most certainly won't find in actual language use ;), then, in my
> experience, you can expect your item random effects to be rather small.
> I don't necessarily think that's a great thing, but it's convenient with
> regard to the statistical issues at hand. If you work with such stimuli,
> I recommend starting by comparing the full model to a model with only
> random subject effects:
>
> 1 + Condition1*Condition2 + (1 + Condition1*Condition2 | Participant)
>
> Again, if the chisq is really small, you don't need to look into item
> effects any further.
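>
> In code (again a sketch with the same hypothetical names):
>
> m.subj <- lmer(RT ~ 1 + Condition1*Condition2 +
>                  (1 + Condition1*Condition2 | Participant),
>                data = d, REML = FALSE)
> anova(m.subj, m.full)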
>
>
> 3) Hal Tily has developed a function (which he posted a while ago to
> this list) that automatically performs stepwise model comparison. While
> I am *not* advocating to blindly rely on automatic model simplification,
> I think this procedure -if restricted to random effects- could be
> acceptable for balanced designs. I think Roger Levy and Hal are working
> on a new and improved implementation, which hopefully will soon be
> available to this list (yippie!).
>
> Ok, sorry for the unsolicited, long, and perhaps trivial (?) email.
> Also, I am sure I've screwed up somewhere, so feel free to correct or
> expand. I just had the feeling that the general procedure outlined above
> wasn't clear to some (many?) readers of this list.
>
> Florian
>
>
>
> On Sun, Feb 20, 2011 at 8:08 AM, Zhenguang Garry Cai <z.cai@ed.ac.uk> wrote:
>
> Hi Ariel,
>
> Sorry for the somewhat misleading information. I do not recall any
> formal requirement from JML for random slopes, but in a recent
> submission to JML, I was required to include random slopes. Also, as
> Roger Levy said, including random slopes is a sensible thing to do.
>
>
> Garry
>
>
> On 20/02/2011 12:37, Ariel M. Goldberg wrote:
>
> Garry,
>
> Does JML have a set of specific requirements for mixed-effects
> analyses? I wasn't able to find anything in their author
> instructions.
>
> Thanks,
> AG
>
> On Feb 13, 2011, at 1:17 PM, Zhenguang Cai wrote:
>
> Hi all,
>
> I have a question concerning random slopes in mixed effects
> modeling. So I ran a structural priming experiment with a
> 4-level variable (prime type: A, B, C and D). The dependent
> variable is response construction (DO dative vs. PO dative).
> The following is a summary of the experiment results.
>
> Prime        A      B      C      D
> DOs         85     24     38     59
> POs         82    144    128    109
> % of DOs   .51    .14    .23    .35
>
> I am interested in whether different prime types induced
> different priming, e.g., whether A led to more DO responses
> than B, C or D. Initially, I ran LME analyses with random
> intercepts only. For instance, I did the following to see
> whether there was a main effect of prime type.
>
> fit.0 = lmer(Response ~ 1 + (1|Subject) + (1|Item), family=binomial)
> fit.p = lmer(Response ~ Prime + (1|Subject) + (1|Item),
>              family=binomial)
> anova(fit.0, fit.p)
>
> Then, I did pairwise comparison by changing the reference
> level for Prime, e.g.,
>
> fit.p = lmer(Response ~ relevel(Prime, ref="B") + (1|Subject) +
>              (1|Item), family=binomial)
>
> It seems that all the levels differed from each other. In
> particular, the comparison between C and D resulted in Estimate =
> -1.02, SE = .32, z = -3.21, p < .01.
>
> But it seems I have to consider whether the slope for Prime
> differs across subjects or items (at least this is a
> requirement from JML). So the following is the way I
> considered whether a random slope should be included in a
> model; I wonder whether I did the correct thing. I first
> determined whether a subject random slope should be included
> by comparing the following two models.
>
> fit.p  = lmer(Response ~ Prime + (1|Subject) + (1|Item),
>               family=binomial)
> fit.ps = lmer(Response ~ Prime + (Prime+1|Subject) + (1|Item),
>               family=binomial)
> anova(fit.p, fit.ps)
>
> I did the same thing for the item random slopes.
>
> fit.p  = lmer(Response ~ Prime + (1|Subject) + (1|Item),
>               family=binomial)
> fit.pi = lmer(Response ~ Prime + (1|Subject) + (Prime+1|Item),
>               family=binomial)
> anova(fit.p, fit.pi)
>
> The subject random slope had a significant effect, so I
> included it in the final model (e.g., fit.ps). But the
> pairwise comparisons returned something different from my
> initial analyses (when the random slope was not included).
> That is, the comparison between C
> and D became only marginally significant (Estimate = -.85,
> SE = .47, z = -1.79, p = .07). It is a bit strange because
> the 9% difference between B and C turned out to be
> significant, but the 12% difference between C and D was not.
>
> Or did I do anything wrong in the analyses?
>
> Thanks,
>
> Garry