[R-lang] Re: Investigating random slope variance

Fri Apr 4 13:48:00 PDT 2014

Well, this thread has gotten really rich and interesting!  

On Apr 4, 2014, at 8:20 AM, Titus von der Malsburg <malsburg@posteo.de> wrote:

> 
> On 2014-04-04 Fri 05:10, T. Florian Jaeger <tiflo@csli.stanford.edu> wrote:
>> I would be careful making anything out of this. The BLUP estimates of the
>> random effects (and, I assume, their distribution) are affected by
>> shrinkage, which is often a desirable (conservative) feature, although it
>> will make differences appear smaller. So, it's not surprising that the
>> fixed effect model mirrors the empirical means more closely. That doesn't
>> mean though that it's the better model to draw conclusions from (about those
>> differences).
> 
> Florian, your comment is spot on.  Here is a plot showing the effect of
> shrinkage in my data set:
> 
>    http://users.ox.ac.uk/~sjoh3968/R/effect_of_shrinkage.png
> 
> Unfilled circles show the empirical mean reading times and differences
> between conditions, one circle for each item.  The dots show the BLUP
> estimates for each item.

Just a comment: yes, the effect of shrinkage should be larger for points farther from the means (both the mean intercept and the mean slope) when the number of observations per cell is constant (which I believe is pretty much the case for you). But this graph looks a bit off: I don’t like the number of lines that cross 0.0 on the y axis.  Are you sure you’re not doing something subtle that makes the comparison not apples-to-apples, like taking means before log-transforming when computing the empirical means?
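
To illustrate why the order of averaging and log-transforming matters, here is a minimal sketch in plain Python. It assumes the model was fit to log reading times; the reading times and the names rt_a and rt_b are made up for illustration and are not from the data set discussed here. With a single slow trial in one condition, the condition difference even changes sign depending on which comes first.

```python
import math
from statistics import mean

# Hypothetical reading times (ms) for one item in two conditions.
# These numbers are invented for illustration only.
rt_a = [250.0, 300.0, 1000.0]   # one slow trial
rt_b = [400.0, 450.0, 500.0]

def log_of_mean(xs):
    """Average the raw values first, then take the log."""
    return math.log(mean(xs))

def mean_of_logs(xs):
    """Take the log of each value first, then average (what a model on log RT sees)."""
    return mean(math.log(x) for x in xs)

# Condition difference (a minus b) under the two orders of operation.
diff_log_of_mean = log_of_mean(rt_a) - log_of_mean(rt_b)
diff_mean_of_logs = mean_of_logs(rt_a) - mean_of_logs(rt_b)

print(f"log of mean:  {diff_log_of_mean:+.3f}")
print(f"mean of logs: {diff_mean_of_logs:+.3f}")
```

If the empirical means in the plot were computed the first way while the model was fit the second way, circles and dots could land on opposite sides of 0.0 for reasons that have nothing to do with shrinkage.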

Roger

> The difference is fairly dramatic.  I assumed that shrinkage would pull
> all data points to the mean with the same force (I have the same amount
> of data for all items).  If that were the case, the ordering of items
> would be preserved.  However, shrinkage affects the individual items in
> quite different ways, and some items are even pushed away from the
> overall means (1, 5, 7, 8, 9, 10, 13, 14, 35) effectively expanding a
> subset of the estimates instead of shrinking them.
> 
> I must say that I find it hard to swallow that two seemingly valid ways
> to analyze the data (item as random effect or fixed effect) yield
> results that are so different.
> 
> Another observation: in the BLUP estimates, the correlation of
> intercepts and slopes seems to be much higher than in the raw data.  The
> correlation of the estimated random intercepts and slopes is -0.86. (The
> summary of the model reports -0.62.)  The correlation of the empirical
> item means and differences is only -0.4.  Why does lmer believe in such
> a high correlation?
> 
>  Titus
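
On the observation that some items are pushed away from the overall mean: when intercepts and slopes are modeled as correlated, shrinkage acts on the (intercept, slope) pair jointly rather than on each coordinate separately. The sketch below is a simplified illustration in plain Python, not what lmer computes internally. It assumes a single grouping factor, known variance components, and a balanced design, in which case the BLUP of an item's deviation is D (D + S)^-1 d, where d is the item's raw deviation from the population means, D is the random-effects covariance, and S is the sampling covariance of the per-item estimates. All numbers are invented; only the negative sign of the correlation echoes the thread.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec2(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

# Assumed random-effects covariance D: intercept SD 0.2, slope SD 0.1,
# correlation -0.6 (hypothetical values).
tau0, tau1, rho = 0.2, 0.1, -0.6
D = [[tau0 ** 2, rho * tau0 * tau1],
     [rho * tau0 * tau1, tau1 ** 2]]

# Assumed sampling covariance S of the per-item estimates; the slope
# (a difference between conditions) is noisier than the intercept.
S = [[0.01, 0.0],
     [0.0, 0.04]]

D_plus_S = [[D[i][j] + S[i][j] for j in range(2)] for i in range(2)]

# Raw deviation of one hypothetical item from the population means:
# a large positive intercept and a slope barely above average.
raw = [0.30, 0.01]

# Joint shrinkage: BLUP = D (D + S)^-1 raw
blup = matvec2(D, matvec2(inv2(D_plus_S), raw))

print(f"intercept: raw {raw[0]:+.3f} -> BLUP {blup[0]:+.3f}")
print(f"slope:     raw {raw[1]:+.3f} -> BLUP {blup[1]:+.3f}")
```

In this example the intercept shrinks toward zero as expected, but the slope moves away from zero and changes sign: the large positive intercept, combined with the negative correlation, is treated as evidence for a below-average slope, and that outweighs the noisy raw slope. The same mechanism also pulls the estimated pairs toward the fitted correlation line, which is one reason the correlation among BLUPs can exceed the one in the raw item means.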