[R-lang] Re: p-values for mixed effects models with random slopes

Levy, Roger rlevy@ucsd.edu
Tue Jan 18 10:50:15 PST 2011


Just some notes on what everyone else has written (see below):

Alex Fine wrote:
> I would imagine many or most of you are aware of the fact that the 
> pvals.fnc function in the languageR package cannot currently estimate 
> p-values for models that include random slopes.  Because we all need 
> p-values, I'm guessing someone out there has developed a way of getting 
> them from such models.  Has anyone implemented something that does this, 
> or have any thoughts on this issue?


Sverre Stausland wrote:
> If that doesn't work, the standard approach to get p-values from
> linear mixed models is to do likelihood ratio tests. If you test fixed
> effects, though, remember to fit the models with maximum likelihood
> fitting, and _not_ with restricted maximum likelihood (i.e. include
> REML=FALSE in the call). (There's an explanation for why in Pinheiro &
> Bates' 2000 book).

On Jan 18, 2011, at 9:17 AM, Nathaniel Smith wrote:

> On Mon, Jan 17, 2011 at 9:19 PM, Steven Piantadosi <piantado@mit.edu> wrote:
>> Unless I'm out of date, p values are broken on glmer too? I wonder if an
>> easy solution to these two problems might be to implement a
>> bootstrapping/resampling algorithm on mixed effect regressions. Does
>> anyone know about this--would it be conservative or anticonservative or
>> a problem on data sets of typical size in psycholinguistics?
>> 
>> If this is actually a good idea, and someone could point me to a
>> reference on how bootstrapping would work on such models (I know
>> references for simple non-mixed effect regressions, but not how
>> bootstrapping interfaces with repeated subject/item measurements and
>> random effects), I'd be happy to try to put some friendly code
>> together.
> 
> The problem, as you say, is that you need your resampling procedure to
> somehow respect the structure you have in your data. If you just have
> one random effect (e.g., subjects), then things are relatively
> straightforward -- see Davison and Hinkley, section 3.8. They
> recommend just resampling subjects, which is slightly
> anti-conservative, but still closer to correct than the "natural"
> approach of first resampling subjects, and then resampling the cases
> within each resampled subject, which turns out to be rather
> conservative.
> 
> If you have crossed random effects -- which I guess we psycholinguists
> always do or we wouldn't be using lmer in the first place -- then
> things are trickier and I don't know what the best approach is.
> Probably some simulations and things are needed. A quick google finds
> a presentation on the 'merBoot' package, but I'm not sure it was ever
> released...

I think that there are a few basic options for the general problem of assessing significance of fixed effects in models with rich random-effects structure:

1) trust the normal approximation to the t- (and more generally the F-) statistic.  My consistent impression is that for models with parameters numbering in the low dozens and observations numbering in the hundreds or thousands, the anticonservativity is minimal.  The examples from Pinheiro & Bates (2000) showing anticonservativity have far fewer observations per parameter.
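For concreteness, a minimal sketch of option 1 in R, using the sleepstudy dataset that ships with lme4 (any lmer fit with a random slope would do; this is an illustration, not a recommendation for any particular dataset):

```r
library(lme4)

# Fit a model with a by-subject random slope; sleepstudy ships with lme4
m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# lmer reports t-values but no p-values; treat the t-statistics as
# approximately standard normal (defensible when observations greatly
# outnumber parameters) to get two-tailed p-values
tvals <- coef(summary(m))[, "t value"]
pvals <- 2 * (1 - pnorm(abs(tvals)))
round(pvals, 4)
```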

2) ditto for the likelihood-ratio test, comparing nested models fit by maximum likelihood (not REML, per Sverre's point above).
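A sketch of option 2 on the same lme4 sleepstudy data; note REML = FALSE in both calls, since likelihood-ratio tests on fixed effects require maximum-likelihood fits:

```r
library(lme4)

# Both models must be fit by maximum likelihood (REML = FALSE)
# because they differ in their fixed effects
m1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
m0 <- lmer(Reaction ~ 1    + (Days | Subject), sleepstudy, REML = FALSE)

# Chi-squared likelihood-ratio test for the fixed effect of Days
anova(m0, m1)
```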

3) The tools *are* available to do the type of fully Bayesian analysis done within pvals.fnc(), though the situation is not yet as user-friendly as with random-intercept models.  I have some pointers to the appropriate techniques in my textbook draft; see especially Chapter 8 and the end of Chapter 4:

   http://idiom.ucsd.edu/~rlevy/textbook/text.html

Any feedback on this material would be much appreciated!


Also: Nathaniel is absolutely correct that it's not as clear how to use resampling-based approaches (bootstrapping, permutation tests) in the crossed-cluster designs that are ubiquitous within psycholinguistics.  And the situation is especially bad given that in a typical psycholinguistics experiment one has a maximum of one observation per participant/item combination!
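For the single-random-effect case, the resample-subjects scheme Nathaniel mentions (Davison & Hinkley, section 3.8) can be sketched roughly as below. The function name boot_by_subject and its arguments are illustrative, not an existing package API, and this does nothing for crossed random effects:

```r
# Hedged sketch of a nonparametric bootstrap that resamples whole
# subjects with replacement (one random effect only).  Duplicated
# subjects are relabeled so the model treats each draw as distinct.
boot_by_subject <- function(data, subj_col, stat_fun, B = 1000) {
  subjects <- unique(data[[subj_col]])
  replicate(B, {
    draw <- sample(subjects, length(subjects), replace = TRUE)
    pieces <- lapply(seq_along(draw), function(i) {
      d <- data[data[[subj_col]] == draw[i], , drop = FALSE]
      d[[subj_col]] <- paste0("boot_subj_", i)  # relabel duplicates
      d
    })
    stat_fun(do.call(rbind, pieces))
  })
}
```

Here stat_fun would refit the mixed model on each resample and return the statistic of interest; a confidence interval or p-value then comes from the resulting bootstrap distribution.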

More generally -- Hal Tily and I are currently working on a comparison of several methods for data analysis in the rich-random-effects situation.  Once we have our results fully systematized and written up, it'd be great to get feedback from list members.

Best

Roger

--

Roger Levy                      Email: rlevy@ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://idiom.ucsd.edu/~rlevy


More information about the ling-r-lang-L mailing list