[R-lang] Weird outcome logit mixed-effect model
Roger Levy
rlevy at ucsd.edu
Mon Dec 22 13:30:34 PST 2008
Laura de Ruiter wrote:
> Dear R-users and -experts,
>
> I am performing a rather simple analysis on a small data set (pasted
> below this email) and keep getting a result that is inexplicable to me.
> Perhaps I am missing something here - it would be great if someone
> could point out to me what I am doing wrong.
>
> I want to test whether the factor "Info" (which has three levels: "new",
> "given", "accessible") is a significant predictor for the binary
> variable "DeaccYN". The random factor is "Subject". The distribution of
> the data looks as follows:
>
> ----------------------------------------------------------------------
>
>               Info
> DeaccYN   given   new   accessible
>     no       25    42           21
>    yes       11     0            1
> ----------------------------------------------------------------------
>
>
> This is the model:
>
> ----------------------------------------------------------------------
>
> deacc.lmer = lmer(DeaccYN ~ Info + (1|Subject), data = dat,
>                   family = "binomial")
> ----------------------------------------------------------------------
>
>
> However, given the distribution above, this outcome seems rather weird
> to me:
>
> ----------------------------------------------------------------------
>
> summary(deacc.lmer)
> Generalized linear mixed model fit using Laplace
> Formula: DeaccYN ~ Info + (1 | Subject)
>    Data: dat
>  Family: binomial(logit link)
>    AIC   BIC logLik deviance
>   60.4 70.82  -26.2     52.4
> Random effects:
>  Groups  Name        Variance Std.Dev.
>  Subject (Intercept) 0.18797  0.43356
> number of obs: 100, groups: Subject, 21
>
> Estimated scale (compare to 1) 0.7316067
>
> Fixed effects:
>                Estimate Std. Error z value Pr(>|z|)
> (Intercept)     -0.8635     0.3795 -2.2754   0.0229 *
> Infonew        -18.7451  2764.2445 -0.0068   0.9946
> Infoaccessible  -2.2496     1.1186 -2.0110   0.0443 *
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> [...]
>
> ----------------------------------------------------------------------
>
>
> Why should the difference between 25/11 and 21/1 be significant, but
> the difference between 25/11 and 42/0 not? The standard error of 2764
> seems very odd to me!
>
> [...]
> I was wondering: Is it perhaps a problem for the model that there are no
> cases in the DeaccYN == "yes" category for Info == "given"? And if this
^^^^^
I believe you mean "new" here.
> is the case, why?
> Am I overlooking something here?
Dear Laura,
Independently of the issue that Florian is raising... you are right that
the lack of "yes" cases for Info == "new" is a problem for the model (to
be precise, it's a problem for estimating the significance of the
parameter estimate using the z value). The z value, which is the basis
of the significance level
reported in the lmer summary, is based on the Wald statistic, which is
z = \hat{b} / StdError(\hat{b})
where \hat{b} is the parameter estimate. However, for parameter
estimates of large magnitudes (as is the case for "new" here), the
standard error is inflated, which leads to a small Wald statistic. This
problem is discussed in a number of places, including Agresti (1996,
2002); here's a decent mention of it online:
http://userwww.sfsu.edu/~efc/classes/biol710/logistic/logisticreg.htm
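Incidentally, you can reproduce the pathology without the random effect
at all. The following is my own toy reconstruction of the 3x2 table
above (variable names assumed to match yours, Subject ignored), fit with
a plain glm:

```r
# Toy data matching the table above: 36 "given", 42 "new", 22 "accessible".
Info  <- factor(rep(c("given", "new", "accessible"), times = c(36, 42, 22)),
                levels = c("given", "new", "accessible"))
Deacc <- c(rep(c(0, 1), c(25, 11)),  # given:      25 no, 11 yes
           rep(0, 42),               # new:        42 no,  0 yes (separated!)
           rep(c(0, 1), c(21, 1)))   # accessible: 21 no,  1 yes

fit <- glm(Deacc ~ Info, family = binomial)
summary(fit)$coefficients
# The Infonew estimate runs off toward -Inf and its standard error is
# enormous, so the Wald z = estimate/SE is close to zero.
```

Note that glm warns about fitted probabilities numerically 0 or 1
occurring; that warning is itself a symptom of complete separation.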
You can see this quite clearly if you add just one "new"/"yes" example
at the end of your data frame, for example:
101 132 new yes
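In code, appending such a row might look like this (a sketch only; the
column names and the stand-in data frame are assumptions on my part):

```r
# Stand-in for the original data frame; only the column layout matters here.
dat <- data.frame(Subject = c(7, 12),
                  Info    = c("given", "new"),
                  DeaccYN = c("yes", "no"))
# Append one Info == "new", DeaccYN == "yes" observation (here Subject 132):
dat <- rbind(dat, data.frame(Subject = 132, Info = "new", DeaccYN = "yes"))
```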
Then the model gives (being sure to correctly specify the levels of "Info"):
> summary(deacc.lmer)
Generalized linear mixed model fit by the Laplace approximation
Formula: Deacc ~ Info + (1 | Subject)
   Data: dat
   AIC   BIC logLik deviance
 69.78 80.24 -30.89    61.78
Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 0.34332  0.58594
Number of obs: 101, groups: Subject, 21
Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.8996     0.3936  -2.286  0.02226 *
Iaccessible  -2.2808     1.1460  -1.990  0.04656 *
Inew         -3.0050     1.1395  -2.637  0.00836 **
The next question, of course, is how to deal with the dataset you
actually have. If you weren't using a mixed-effects model, you could
use a likelihood-ratio test by comparing your full model with a simpler
model; the likelihood-ratio test isn't susceptible to problems with
large parameter estimates the way the Wald test is. For example:
> deacc.glm = glm(Deacc ~ Info, data = dat, family = "binomial")
> deacc.glm1 = glm(Deacc ~ Info1, data = dat, family = "binomial")
> anova(deacc.glm, deacc.glm1, test = "Chisq")
Analysis of Deviance Table

Model 1: Deacc ~ Info
Model 2: Deacc ~ Info1
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1        97     52.452
2        98     71.600 -1  -19.148 1.210e-05
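Here "Info1" is Info with the "new" level collapsed into the baseline
(an assumption on my part about how it was constructed), so the one-df
comparison isolates exactly the "new" contrast. A self-contained sketch
with toy data matching your table reproduces the deviances above:

```r
# Toy data matching the table (assumed to mirror the real data frame).
Info  <- factor(rep(c("given", "new", "accessible"), times = c(36, 42, 22)),
                levels = c("given", "new", "accessible"))
Deacc <- c(rep(c(0, 1), c(25, 11)), rep(0, 42), rep(c(0, 1), c(21, 1)))

# Hypothetical construction of Info1: merge "new" into the baseline "given".
Info1 <- Info
levels(Info1)[levels(Info1) == "new"] <- "given"

full    <- glm(Deacc ~ Info,  family = binomial)  # residual deviance 52.452
reduced <- glm(Deacc ~ Info1, family = binomial)  # residual deviance 71.600
anova(full, reduced, test = "Chisq")              # difference 19.148 on 1 df
```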
In principle, you could do this with your mixed-effects model, being
sure to use ML instead of REML fitting:
> deacc.lmer = lmer(Deacc ~ Info + (1 | Subject), data = dat,
+                   family = "binomial", REML = FALSE)
> deacc.lmer1 = lmer(Deacc ~ Info1 + (1 | Subject), data = dat,
+                    family = "binomial", REML = FALSE)
> anova(deacc.lmer,deacc.lmer1)
Data: dat
Models:
deacc.lmer1: Deacc ~ Info1 + (1 | Subject)
deacc.lmer: Deacc ~ Info + (1 | Subject)
            Df    AIC    BIC  logLik Chisq Chi Df Pr(>Chisq)
deacc.lmer1  3 77.600 85.416 -35.800
deacc.lmer   4 60.400 70.821 -26.200  19.2      1  1.177e-05 ***
There is an argument that the likelihood-ratio test is anti-conservative
and hence inappropriate for comparing mixed-effects models differing
only in fixed-effects structure. (See Pinheiro & Bates, 2000, around
page 76 or so. The argument is made only for linear mixed-effects
models and I'm not sure of the status for logit mixed-effects models.)
That being said, it's not clear how strong the anti-conservativity is,
and in your case it seems like you have such an exceedingly powerful
effect that you might be safe in using the likelihood-ratio test here
and just mentioning the potential anti-conservativity as a caveat.
So the summary is: believe it or not, how to assess the significance of
the parameter estimate for "new" for your model & dataset is a bit of an
open question, but it seems pretty clear that the estimate is
significantly non-zero.
Best & hope this helps.
Roger
--
Roger Levy                      Email: rlevy at ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy