[R-lang] Re: p-values from pvals.fnc

Mon Aug 1 08:11:31 PDT 2011

Just a follow-up on this: Nathaniel is exactly right that this clever reparameterization doesn't put us in business for MCMC sampling within lme4.  The whole point of the MCMC sampling in the first place is to average over the uncertainty in the random-effects covariance matrix, and that averaging should also take into account uncertainty in the correlation parameters.  This is, e.g., what Baayen, Davidson, & Bates 2008 are saying when they write (bottom of page 396):

  "...the original purpose of the t and F-distributions is to take into account the imprecision in the estimate of the variance of the random disturbances when formulating inferences regarding the fixed-effects parameters...This is what the t and F-distributions accomplish in the case of models with fixed-effects only. Crucially, the MCMC technique applies to more general models and to data sets with arbitrary structure."

I should also point out that if one is truly intent on drawing fixed-effects inferences with random-slope (or richer) mixed models using MCMC samples, one can implement the mixed-effects model in BUGS or JAGS and just go and sample from the posterior; JAGS in particular is cross-platform and has a great R interface.  Gelman and Hill (2007) has examples of this; I have a couple of examples in the Hierarchical Models chapter of my textbook draft as well.  (Unfortunately, this is currently much more time-consuming and error-prone than just writing out an lme4 formula...!)
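
To make Nathaniel's point about data dependence concrete, here is a minimal sketch in plain Python rather than lme4.  It assumes, purely for illustration, that per-subject intercepts and slopes are observed directly instead of being estimated inside a mixed model; the function names and simulation settings are hypothetical.  Recentering x as x' = x - c turns the random intercept b0 into b0 + c*b1, so the intercept-slope covariance vanishes at c = -cov(b0, b1)/var(b1), a quantity that has to be estimated separately from each data set.

```python
import random


def sample_cov(u, v):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (n - 1)


def decorrelating_offset(b0, b1):
    """Offset c such that, with x' = x - c, the shifted intercepts
    b0 + c*b1 are uncorrelated with the slopes b1 in this sample."""
    return -sample_cov(b0, b1) / sample_cov(b1, b1)


def simulate_subjects(n, seed, rho=0.5, sd0=1.0, sd1=0.5):
    """Draw n (intercept, slope) pairs with true correlation rho.
    All settings here are illustrative assumptions."""
    rng = random.Random(seed)
    b0, b1 = [], []
    for _ in range(n):
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        b0.append(sd0 * z0)
        b1.append(sd1 * (rho * z0 + (1 - rho ** 2) ** 0.5 * z1))
    return b0, b1


def shifted_cov(b0, b1, c):
    """Intercept-slope covariance after recentering x by c."""
    return sample_cov([a + c * b for a, b in zip(b0, b1)], b1)


# Two "experiments" drawn from the same population.
b0_a, b1_a = simulate_subjects(30, seed=1)
b0_b, b1_b = simulate_subjects(30, seed=2)
c_a = decorrelating_offset(b0_a, b1_a)
c_b = decorrelating_offset(b0_b, b1_b)

print("offset from data set A:", c_a)
print("offset from data set B:", c_b)
# Zero up to floating-point rounding:
print("A's offset applied to A:", shifted_cov(b0_a, b1_a, c_a))
# Generally nonzero, because B calls for a different offset:
print("A's offset applied to B:", shifted_cov(b0_b, b1_b, c_a))
```

Under these assumptions the two data sets give different offsets, so the offset computed on one leaves a residual correlation in the other.  That sampling variability in the offset is what mcmcsamp() would have to resample and currently does not.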

Best

Roger

On Jul 31, 2011, at 6:58 PM, <kliegl@uni-potsdam.de> wrote:

> Yes, I see your point. Thanks.
> 
> So one would need to accumulate evidence for a consistent offset across experiments in a specific domain. This offset (if it is reliable) could be applied before fitting the LMM. Then, one checks the significance of the correlation parameter, with the expectation that it will not be significant. Finally, you are in business with a model without a correlation parameter.
> 
> Reinhold Kliegl
> 
> Quoting Nathaniel Smith <njs@pobox.com>:
> 
>> On Sun, Jul 31, 2011 at 4:24 PM,  <kliegl@uni-potsdam.de> wrote:
>>> Your last paragraph (What do we gain by the exercise?) nicely summarizes
>>> what motivated my question about the generalization. Remember this thread
>>> got started because in the current lme4 implementation mcmcsamp() [or
>>> wrappers like HPDinterval(), pvals.fnc() and friends] do not work for models
>>> with random correlation parameters. Psycholinguists and psychologists would
>>> like to use MCMC to get CIs primarily for the fixed effect estimates.
>> 
>> Right. The problem is, essentially, that mcmcsamp() does not know how
>> to resample the correlation parameter.
>> 
>>> Jon's proposal was to reparameterize the models in a way that the
>>> correlation parameter is zero. Then, we can use mcmcsamp() to get p-values
>>> and reviewers/editors will be happy. I am pretty sure that it is only a
>>> reparameterization because logLik, deviance, and REMLdev do not change; they
>>> are the same for the three models in the illustration. Therefore, in the
>>> simple varying intercept/varying slope model, I think you get usable MCMC
>>> statistics for the fixed effect of slope with this trick, because the offset
>>> on X does not change the interpretation of the slope. I do not see anything
>>> anti-conservative here. Am I missing something?
>> 
>> What I *think* you may be missing is that this reparametrization
>> depends on the data. So, you could apply the same reparametrization to
>> other data, and the logLik, deviance and REMLdev would be the same as
>> if you didn't reparametrize. But on this new data, the
>> reparametrization will probably not produce a zero correlation
>> coefficient -- you'd have to calculate the right reparametrization for
>> that new data, and it would be different.
>> 
>> So the point is, there is some uncertainty about the right
>> reparametrization to use. To get proper p-values, mcmcsamp() should
>> resample the reparametrization, but it doesn't know how to do that
>> either.
>> 
>> If we are willing to assume that our estimates for the random effect
>> parameters are correct, then it's easy to get exact p-values without
>> any MCMC or anything. The whole reason we need MCMC is that there is
>> noise in our estimates of random effect parameters, and this
>> reparametrization technique doesn't take that noise into account.
>> IIUC.
>> 
>> -- Nathaniel
>> 
> 
> 
> 

--

Roger Levy                      Email: rlevy@ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://idiom.ucsd.edu/~rlevy