[R-lang] Re: p-values from pvals.fnc

Nathaniel Smith njs@pobox.com
Sun Jul 31 14:39:47 PDT 2011


How clever! The basic idea, if I understand correctly, is: we can
estimate the full ML covariance matrix for the random effects; this
describes a multivariate Gaussian distribution with some arbitrary
elliptical, correlated shape. There's some linear transformation that
straightens out that shape (i.e., that diagonalizes the matrix), making
the components uncorrelated (this is PCA, basically). If we apply some
version of that transformation to the predictor variables, then we get
transformed predictors whose ML covariance matrix for the random
effects is diagonal.

So I think you can do this for arbitrary sets of predictors by using a
full PCA. Here we added a constant offset, which amounts to adding a
multiple of the (constant) intercept column to the predictor; in
general, I think you need to transform the kth predictor by adding some
linear combination of the first (k-1) predictors, as in the sketch
below.
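Here's a minimal sketch of that generalization (my own notation and
code, nothing established; it uses a triangular, Cholesky-style
factorization rather than a literal PCA, because that is what keeps
each transformed predictor equal to the original plus a combination of
the *earlier* ones): given the fitted K x K random-effects covariance
Sigma over the columns of the random-effects model matrix
Z = [1, x_1, ..., x_{K-1}], find a unit upper-triangular U with
Sigma = U D U' for diagonal D. Replacing Z by Z %*% U turns the random
effects into b' = solve(U) %*% b, whose covariance
solve(U) %*% Sigma %*% t(solve(U)) is exactly D.

udu <- function(Sigma) {
  K  <- ncol(Sigma)
  p  <- K:1                              # reversal permutation
  Ct <- chol(Sigma[p, p])                # Sigma[p, p] = t(Ct) %*% Ct
  L  <- t(Ct) %*% diag(1 / diag(Ct), K)  # unit lower triangular
  U  <- L[p, p]                          # unit upper triangular
  list(U = U, D = diag(Ct)[p]^2)         # Sigma = U %*% diag(D) %*% t(U)
}

# Two-predictor check against Reinhold's offset below: with
# Sigma = [va, cab; cab, vb], U[1, 2] comes out as cab/vb, so the
# transformed slope predictor is x + cab/vb -- the same d_0.
Sigma <- matrix(c(612.1, 9.67, 9.67, 35.07), 2, 2)
udu(Sigma)$U[1, 2]                       # 9.67/35.07, about 0.276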

What I'm not so clear on is what we gain by doing this exercise. It
lets us produce a simplified model, but the simplification rests on the
assumption that we know the correct covariance term exactly, when in
fact we just estimated it. Any statistics we compute with the
simplified model will fail to take that source of uncertainty into
account. Of course, this is the same problem that motivates using
mixed-effects models in the first place -- if we knew the
random-effects covariance a priori, we could just use GLS and be done,
without messing around with hierarchical models at all. But plugging in
an estimate as if it were the known truth is anti-conservative, and any
inferences you make in the simplified model will likewise be
anti-conservative with respect to the full model, even if the technique
you then apply is otherwise sound. So it seems like this trick doesn't
actually help us with the underlying problem of doing inference in the
presence of correlated random effects?

-- Nathaniel

On Sun, Jul 31, 2011 at 2:15 PM,  <kliegl@uni-potsdam.de> wrote:
> Hi Roger,
>
> Here you go.
>
> U. Halekoh: Generalized Linear Mixed Models (GLMM), April 3, 2007
> http://www.scribd.com/doc/56561634/GLMM-Ulrich-Halekoh
> (see section 13, page 66)
>
> (1)  y_it = mu + a_i + b_i*x_t + e_it
> (Basic eq with usual distr assumptions; assume cov(a,b) != 0 )
>
> (2)  z_t = x_t + d
> (Offset d)
>
> (3)  y_it = mu + (a_i - b_i*d) + b_i*(x_t + d) + e_it
>            = mu + a'_i + b_i*z_t + e_it,
> where a'_i = a_i - b_i*d
>
> Now compute variance of a'_i and its covariance with b_i:
> (4) var(a'_i) = var(a_i) + d^2*var(b_i) - 2*d*cov(a_i, b_i)
> (5) cov(a'_i, b_i) = cov(a_i, b_i) - d*var(b_i)
>
> If d = d_0 = cov(a_i, b_i)/var(b_i) in (5), then cov(a'_i, b_i) = 0
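>
> (A quick numeric check of (4) and (5) by simulation -- just a sketch
> with made-up variances, not from the handout:)
>
> library(MASS)                        # for mvrnorm
> set.seed(1)
> S  <- matrix(c(4, 1.2, 1.2, 1), 2)   # var(a)=4, cov(a,b)=1.2, var(b)=1
> re <- mvrnorm(1e5, c(0, 0), S)       # simulated (a_i, b_i) pairs
> a  <- re[, 1]; b <- re[, 2]
> d0 <- cov(a, b) / var(b)             # the decorrelating offset d_0
> cov(a - b * d0, b)                   # ~ 0, as in (5)
> var(a - b * d0)                      # ~ var(a) + d0^2*var(b) - 2*d0*cov(a,b)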
>
> I hope I got the subscripts right. Any idea whether this can be
> generalized to models with X1, X2, ..., Xn?
>
> Best
> Reinhold
>
> --
> Reinhold Kliegl
> http://read.psych.uni-potsdam.de/pmr2
>
> Quoting "Levy, Roger" <rlevy@ucsd.edu>:
>
>> Hi Reinhold & co.,
>>
>> I'm not sure whether this is exactly what Jon was suggesting, but it
>>  seems pretty clever!  Do you think you could point us to the handout
>>  and/or summarize this derivation?
>>
>> Best
>>
>> Roger
>>
>>
>> On Jul 30, 2011, at 4:29 PM, <kliegl@uni-potsdam.de>
>>  <kliegl@uni-potsdam.de> wrote:
>>
>>> Jon's proposal appears to work if we do not care about the intercept of
>>> X. Also, we need to estimate the parameters of the full model first in
>>> order to compute the X-offset that yields a zero intercept-slope
>>> correlation parameter. Specifically, if a and b are the intercept and
>>> slope, then the offset for X that gives a zero intercept-slope
>>> correlation parameter is cov(a,b)/var(b). (I saw this derivation, which
>>> also yields a minimum-variance estimate for the intercept, in a 2007
>>> GLMM handout by Ulrich Halekoh.) To illustrate this with the sleepstudy
>>> data:
>>>
>>>> (fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy))
>>>
>>> Linear mixed model fit by REML
>>> Formula: Reaction ~ Days + (Days | Subject)
>>>  Data: sleepstudy
>>>  AIC  BIC logLik deviance REMLdev
>>> 1756 1775 -871.8     1752    1744
>>> Random effects:
>>> Groups   Name        Variance Std.Dev. Corr
>>> Subject  (Intercept) 612.092  24.7405
>>>         Days         35.072   5.9221  0.066
>>> Residual             654.941  25.5918
>>> Number of obs: 180, groups: Subject, 18
>>>
>>> Fixed effects:
>>>           Estimate Std. Error t value
>>> (Intercept)  251.405      6.825   36.84
>>> Days          10.467      1.546    6.77
>>>
>>> # Compute the offset and add it to Days
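>>> # (Worked out from the fm1 output above: cov(a,b) = 0.066 * 24.7405 *
>>> # 5.9221 = 9.67, so the offset added below is 9.67/35.072 = 0.276.)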
>>>
>>>> sleepstudy$DaysA <- sleepstudy$Days + (0.066*24.7405*5.9221)/35.072
>>>> (fm3 <- lmer(Reaction ~ DaysA + (DaysA|Subject), sleepstudy))
>>>
>>> Linear mixed model fit by REML
>>> Formula: Reaction ~ DaysA + (DaysA | Subject)
>>>  Data: sleepstudy
>>>  AIC  BIC logLik deviance REMLdev
>>> 1756 1775 -871.8     1752    1744
>>> Random effects:
>>> Groups   Name        Variance Std.Dev. Corr
>>> Subject  (Intercept) 609.460  24.6872
>>>         DaysA        35.072   5.9221  0.000
>>> Residual             654.941  25.5918
>>> Number of obs: 180, groups: Subject, 18
>>>
>>> Fixed effects:
>>>           Estimate Std. Error t value
>>> (Intercept)  248.519      6.896   36.04
>>> DaysA         10.467      1.546    6.77
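>>>
>>> # (An aside, not from the handout: the offset can also be pulled out
>>> # of the fitted model rather than transcribed from the printed
>>> # estimates -- a sketch, assuming VarCorr(fm1)$Subject returns the
>>> # 2x2 random-effects covariance matrix:)
>>>
>>>> vc <- VarCorr(fm1)$Subject   # covariance of (Intercept, Days) effects
>>>> d0 <- vc[1, 2] / vc[2, 2]    # d_0 = cov(a, b)/var(b)
>>>> sleepstudy$DaysA <- sleepstudy$Days + d0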
>>>
>>> Now to the model with two independent variance components, for which the
>>> MCMC p-values can be computed:
>>>> (fm4 <- lmer(Reaction ~ DaysA + (1|Subject) + (0+DaysA|Subject), sleepstudy))
>>>
>>> Linear mixed model fit by REML
>>> Formula: Reaction ~ DaysA + (1 | Subject) + (0 + DaysA | Subject)
>>>  Data: sleepstudy
>>>  AIC  BIC logLik deviance REMLdev
>>> 1754 1770 -871.8     1752    1744
>>> Random effects:
>>> Groups   Name        Variance Std.Dev.
>>> Subject  (Intercept) 609.346  24.6849
>>> Subject  DaysA        35.066   5.9217
>>> Residual             654.951  25.5920
>>> Number of obs: 180, groups: Subject, 18
>>>
>>> Fixed effects:
>>>           Estimate Std. Error t value
>>> (Intercept)  248.519      6.896   36.04
>>> DaysA         10.467      1.546    6.77
>>>
>>> Not sure how well this generalizes, but it appears that Jon's proposal
>>> could be implemented at least for the simple
>>> varying-intercept/varying-slope model.
>>>
>>> Reinhold Kliegl
>>> http://read.psych.uni-potsdam.de/pmr2
>>>
>>>
>>> Quoting Jonathan Baron <baron@psych.upenn.edu>:
>>>
>>>> I have a question about this, which has been bugging me for a while.
>>>> It is very probably very naive, but here is a chance to ask it.
>>>>
>>>> pvals.fnc() does not deal with random slopes because it complains
>>>> about "random correlation parameters".  I'm not quite sure what this
>>>> means, but I think it may mean that it wants to take into account the
>>>> correlation of the slopes and intercepts, but it can't.  Consider the
>>>> following:
>>>>
>>>> l1 <- lmer(Y ~ X + (1+X|S))
>>>> l2 <- lmer(Y ~ X + (1|S) + (0+X|S))
>>>> l3 <- lmer(Yt ~ Xt + (1|S) + (0+Xt|S))  # Xt, Yt: transformed X and Y
>>>>
>>>> pvals.fnc() does not work with l1.  It will work with l2, but I think
>>>> l2 imposes the assumption of no correlation of slopes and intercepts.
>>>> This may be approximately true, and maybe you can tell by looking at
>>>> the output of l1, where it gives correlations in the random effects
>>>> part.  (It also gives them for the fixed effects, but I think that is
>>>> not the issue, although I'm not sure.)
>>>>
>>>> If the assumption of no correlation isn't true, can't you transform X
>>>> and Y into Xt and Yt (or maybe just transform one of them) by adding
>>>> or subtracting a constant?  I have tried various ways of doing this,
>>>> but nothing has worked for me except trial and error.  In many cases,
>>>> transformation by a constant would not change the substantive
>>>> inferences that we want to draw.
>>>>
>>>> Jon
>>>>
>>>> On 07/30/11 20:53, Jakke Tamminen wrote:
>>>>>
>>>>> Roger: Thanks for the information, I guess I have a lot of reading to
>>>>> do!
>>>>>
>>>>> Alex and Roger: Looks like my version of lme4 is pretty old, 0.99875-6.
>>>>> If the more recent versions don't give you p-values for models with
>>>>> random slopes, should I be looking at them (the p-values) at all, or
>>>>> rely on the t-statistic (and probably update my packages!)?
>>>>>
>>>>> Jakke
>>>>>
>>>>> On 30 July 2011 18:45, Alex Fine <afine@bcs.rochester.edu> wrote:
>>>>>
>>>>> > Jakke,
>>>>> >
>>>>> > I'm probably missing something, so I'm not replying-all.  How do you
>>>>> > even get pvals.fnc() to work with a model that has random slopes?  I
>>>>> > have the most up-to-date version of the languageR package and it
>>>>> > won't take models that have anything other than random intercepts.
>>>>> >
>>>>> > thanks,
>>>>> > Alex
>



More information about the ling-r-lang-L mailing list