[R-lang] Re: Investigating random slope variance
Levy, Roger
rlevy@ucsd.edu
Mon Apr 7 15:16:18 PDT 2014
On Apr 7, 2014, at 8:09 AM, Titus von der Malsburg <malsburg@posteo.de> wrote:
>
> On 2014-04-04 Fri 21:48, Levy, Roger <rlevy@ucsd.edu> wrote:
>> On Apr 4, 2014, at 8:20 AM, Titus von der Malsburg <malsburg@posteo.de> wrote:
>>> Here is a plot showing the effect of
>>> shrinkage in my data set:
>>>
>>> http://users.ox.ac.uk/~sjoh3968/R/effect_of_shrinkage.png
>>
>> [...] But this graph looks a bit off — I don’t like the number of
>> lines that cross 0.0 on the y axis. Are you sure you’re not doing
>> something subtle that makes the comparison not apples-to-apples, like
>> taking means before log-transforming in computing the empirical means?
>
> Hi Roger,
>
> I took a shortcut when calculating the intercepts and slopes for the
> random effects model. By doing that I was indeed missing a part of
> the action. Here's a new plot that I generated using lme4's predict
> function (code at the bottom of this mail):
>
> http://users.ox.ac.uk/~sjoh3968/R/effect_of_shrinkage2.png
>
> This looks much more consistent with the characterization of the BLUPs
> as weighted averages. (The data points are not vertically centered on
> zero because the predict function also added in the main effect of
> condition, 0.06224.)
Thanks Titus — as you point out, this looks much more consistent. (And as I’m sure you’re thinking, the fact that shrinkage is not toward 0.0 on the y axis is because of the large negative correlation between the intercept and the slope in the fitted model.) Though I’m still surprised that there’s so much more shrinkage in the y direction than in the x direction, despite the fact that the random slope standard deviation is so much smaller than that of the random intercept.
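In case it's useful, the fitted variance components are easy to pull out of the model directly; an untested sketch, assuming your model object is called `tt` as in your code below:

VarCorr(tt)                  # prints random-effect SDs and correlations
vc <- VarCorr(tt)[["item"]]  # item-level covariance matrix
attr(vc, "stddev")           # random intercept and slope SDs
attr(vc, "correlation")      # fitted intercept-slope correlation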
>
> To remind you of the original question: I wanted to know which items are
> read significantly faster or slower in the manipulated condition. Based
> on the BLUPs, these are items 25 and 36.
So, hold on: are you interested in (i) for which items can you conclude with (1-p)% confidence that the total effect of the manipulation is significantly non-zero in a particular direction [item-average slope + item-specific slope], or (ii) for which items can you conclude that their idiosyncratic sensitivity to the manipulation, above and beyond the population-average sensitivity, is significantly non-zero in a particular direction [only item-specific slope]? My impression is that the inferences you’re drawing from both your fixed-effect and random-effect models are addressing question (ii), but your statement above makes it sound like you’re interested in (i).
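To make the distinction concrete, here is an untested sketch of the two quantities, conditional on the point estimates (the column index of the slope in ranef() depends on how `cond` is coded, so adjust as needed):

re <- ranef(tt)$item
idiosyncratic <- re[, 2]         # question (ii): item-specific deviation only
total <- fixef(tt)[2] + re[, 2]  # question (i): population-average slope + deviation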
> (According to the model that
> uses item as a fixed effect these are 11, 23, 35, 36.) Florian
> explained why I should trust the BLUP estimates. While I understand the
> logic, I hesitate to believe the BLUP results: The reason why 25
> comes out as significant is that lmer believes in a very high
> correlation of random intercepts and slopes (-0.62 according to the
> summary, -0.86 in the BLUP estimates, but only -0.4 in the data). For
> this reason items 25 and 36 (which have rather extreme intercepts) are
> shrunken much less than items 11 and 23 (which have average
> intercepts). So whether the conclusions from the BLUPs hold up depends
> on whether the correlation between intercepts and slopes is really as
> high as lmer thinks it is. Without understanding how exactly lmer
> derives these correlations I have some doubts, in part because from
> experience I know that lmer tends to overestimate these correlations
> particularly with sparse data. Bottom line: It seems I can trust
> neither the BLUP estimates nor the fixed-effect model.
Your concern about the correlation coefficient in the random-effects covariance matrix seems reasonable to me. I don’t know how the new lme4’s ranef() function extracts confidence intervals, but if it does so conditional on the point estimate of the random-effects covariance matrix, then the standard caveat applies that this kind of approach fails to take into account uncertainty in that covariance matrix. For this reason you might want to consider Bayesian inferential methods. My textbook-in-progress has some examples of how you can set these up in JAGS, I think Shravan has pedagogical materials now that show how to do this in Stan, and there are R packages that may be useful for this more directly (e.g., MCMCglmm). That way you can inspect the posterior on the correlation parameter.
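For instance, in MCMCglmm something along the following lines might work. This is an untested sketch: the prior is just a generic weakly informative choice, the VCV column indices assume the us() term comes first, and you may need complete cases and factor-coded grouping variables.

library(MCMCglmm)

prior <- list(R = list(V = 1, nu = 0.002),
              G = list(G1 = list(V = diag(2), nu = 2),  # us(1+cond):item
                       G2 = list(V = 1, nu = 0.002)))   # subj intercepts

m <- MCMCglmm(log(tt) ~ cond * Region.length,
              random = ~ us(1 + cond):item + subj,
              data   = subset(d, !is.na(tt)),
              prior  = prior)

# m$VCV holds posterior samples of the (co)variance components; with the
# 2x2 us() matrix in the first four columns, the posterior of the
# intercept-slope correlation is:
rho <- m$VCV[, 2] / sqrt(m$VCV[, 1] * m$VCV[, 4])
hist(rho)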
I would note, though, that a substantial negative correlation is evident in the empirical means on your latest plot, so the negative correlation is not unreasonable.
Regarding the fixed-effect vs. random-effect approach question: I agree with Florian that the random-effects approach is preferable in this case. Let’s put the question this way: when inferring the slope for a given item i, do you want the distributional information that can be gleaned from all the OTHER items besides i to inform your inferences about i? If so, then you want to use the random-effects approach. If not, why not? The issues at stake can get rather subtle, but I think that the weight of the argument will probably support the random-effects approach.
Best
Roger
> Titus
>
> # Compiling item-level data:
>
> # Descriptive stats:
> x <- with(d, tapply(log(tt), list(item, cond), mean, na.rm=T))
> items <- data.frame(item=1:40,
>                     sample.mean=with(d, tapply(log(tt), item, mean, na.rm=T)),
>                     sample.difference=x[,2] - x[,1])
>
> # Results from the random-effects model:
> tt <- lmer(log(tt) ~ cond * Region.length + (1|subj) + (cond|item), d)
> ptt <- predict(tt)
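> # predict() drops rows with missing values; the names of the returned
> # vector are row indices of d, so map the fitted values back by name: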
> d$ptt[as.integer(names(ptt))] <- ptt
> items$random.intercept <- with(d, tapply(ptt, item, mean, na.rm=T))
> x <- with(d, tapply(ptt, list(item, cond), mean, na.rm=T))
> items$random.slope <- x[,2] - x[,1]
>
> with(items,
>      plot(sample.mean, sample.difference,
>           main="Effect of shrinkage on random intercepts and slopes for items",
>           xlab="mean reading time",
>           ylab="difference between conditions"))
>
> for (i in seq_len(nrow(items))) {
>   with(items, arrows(sample.mean[i], sample.difference[i],
>                      random.intercept[i], random.slope[i],
>                      length=0.1, col="darkgrey"))
> }
> with(items, points(random.intercept, random.slope, pch=20))
> with(items, text(sample.mean+0.05, sample.difference, labels=items$item))
> grid()
> # "zero"-line at the level of the fixed effect of condition:
> abline(fixef(tt)[2], 0)