[R-lang] Re: Investigating random slope variance

Fri Apr 4 08:54:58 PDT 2014

Just to add: the BLUPs are not guaranteed to come out normally distributed at all. Their distribution generally tracks that of the underlying data, whether normal or not. Normally distributed random effects are an assumption of the mixed model, but not one that the software enforces on the BLUPs.

As far as I have seen. 
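
A minimal sketch of why this happens, under simplifying assumptions that are mine and not taken from the model discussed in this thread: a random-intercept-only model, the same number of observations per item, and variance components treated as known rather than estimated. In that case each BLUP is the item's deviation from the grand mean multiplied by one common weight, so the shape of the distribution (here, its skewness) carries over unchanged. The function names and the numbers are hypothetical.

```python
import random
import statistics


def skewness(xs):
    """Population skewness: mean of the cubed z-scores."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def blups_balanced(group_means, tau2, sigma2, n_per_group):
    """BLUPs for a balanced random-intercept model with known variances.

    tau2   -- between-item variance (assumed known here)
    sigma2 -- residual variance (assumed known here)
    """
    grand_mean = statistics.fmean(group_means)
    # Common shrinkage weight: the same for every item when n is equal.
    lam = tau2 / (tau2 + sigma2 / n_per_group)
    return [lam * (m - grand_mean) for m in group_means]


random.seed(1)
# Deliberately non-normal (right-skewed) item means.
item_means = [random.expovariate(1.0) for _ in range(40)]
blups = blups_balanced(item_means, tau2=1.0, sigma2=4.0, n_per_group=8)

# The two skewness values agree up to floating-point error.
print(skewness(item_means), skewness(blups))
```

With random slopes, or with unequal amounts of data per item, the mapping is no longer a single common rescaling, so the correspondence becomes looser, but nothing in the procedure forces the BLUPs toward a normal shape.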

Dan

> On 4 Apr 2014, at 16:48, Scott Jackson <scottuba@gmail.com> wrote:
> 
> One small point to add about the shrinkage: you should see more shrinkage for items with less data.  This is also a desired result (usually), since items/subjects with relatively little data tend to have more extreme results than they should. For example, think of a case where an item has only two observations and both are "correct" when measuring accuracy. That item would end up with an infinite logit accuracy, so a LOT of shrinkage is a good thing!  Not sure if different amounts of data are driving where you're seeing the most shrinkage in your data, but it's something to look for.
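
To put numbers on the point above, here is a small sketch. The empirical logit for an item with two correct answers out of two is infinite, and the shrinkage weight shown is the textbook one for a linear random-intercept model with known variances. Both the weight formula and the variance values are illustrative assumptions; a logistic mixed model has no such closed form, although it behaves similarly in spirit. Function names are hypothetical.

```python
import math


def empirical_logit(correct, total):
    """Log-odds of the observed proportion; infinite at 0% or 100%."""
    p = correct / total
    if p in (0.0, 1.0):
        return math.copysign(math.inf, p - 0.5)
    return math.log(p / (1 - p))


def shrinkage_weight(n, tau2, sigma2):
    """Share of the item's own deviation that survives shrinkage.

    Linear random-intercept case with known variances (an assumption):
    tau2 is the between-item variance, sigma2 the residual variance.
    A weight near 0 means strong shrinkage toward the grand mean.
    """
    return tau2 / (tau2 + sigma2 / n)


# Two observations, both correct: the raw estimate is unusable.
print(empirical_logit(2, 2))

# Less data means a smaller weight, i.e. more shrinkage.
for n in (2, 10, 50):
    print(n, round(shrinkage_weight(n, tau2=1.0, sigma2=4.0), 3))
```

The weight rises toward 1 as n grows, which is why items with plenty of data stay close to their empirical means while sparse items are pulled in strongly.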
> 
> The more general point is that lmer is estimating just the variance of the random effect (i.e., how much item-level variance there seems to be), and the BLUPs represent an attempt to characterize how the items fall within this estimated variance -- they are not individually "estimated" from the data in the same sense as the other parameters in the model. If the "true" effects of the individual items are not at all normally distributed, then you could also see a kind of misfit between the empirical estimates/means and the BLUPs (which are virtually guaranteed to be distributed normally, AFAIK).
> 
> In any case, to the extent that you have a lot of data on each item, similar amounts of data on each item, and to the extent that their effects *are* distributed fairly normally, the BLUPs and empirical estimates will be relatively close to each other. If this is not the situation, the empirical estimates and the BLUPs can differ quite a lot, but this is typically a good thing, in terms of how we usually want to interpret item effects. If, when thinking about the design of your items/regions, you have reason to believe that they should *not* show a normal distribution of effects (maybe you believe there are two distinct "types" of items/effects, or something), then you may need to resort to a more flexible Bayesian estimation procedure, where you can specify the expected (aka prior) distribution of the random effects to be something other than normal.
> 
> I have nothing insightful to say about the intercept/slope correlations :-)
> 
> best,
> -scott
> 
> 
> 
> 
>> On Fri, Apr 4, 2014 at 11:20 AM, Titus von der Malsburg <malsburg@posteo.de> wrote:
>> 
>> On 2014-04-04 Fri 05:10, T. Florian Jaeger <tiflo@csli.stanford.edu> wrote:
>> > I would be careful making anything out of this. The BLUP estimates of the
>> > random effects (and, I assume, their distribution) are affected by
>> > shrinkage, which is often a desirable (conservative) feature, although it
>> > will make differences appear smaller. So, it's not surprising that the
>> > fixed effect model mirrors the empirical means more closely. That doesn't
>> > mean though that it's the better model to draw conclusions from (about those
>> > differences).
>> 
>> Florian, your comment is spot on.  Here is a plot showing the effect of
>> shrinkage in my data set:
>> 
>>     http://users.ox.ac.uk/~sjoh3968/R/effect_of_shrinkage.png
>> 
>> Unfilled circles show the empirical mean reading times and differences
>> between conditions, one circle for each item.  The dots show the BLUP
>> estimates for each item.
>> 
>> The difference is fairly dramatic.  I assumed that shrinkage would pull
>> all data points to the mean with the same force (I have the same amount
>> of data for all items).  If that were the case, the ordering of items
>> would be preserved.  However, shrinkage affects the individual items in
>> quite different ways, and some items are even pushed away from the
>> overall means (1, 5, 7, 8, 9, 10, 13, 14, 35) effectively expanding a
>> subset of the estimates instead of shrinking them.
>> 
>> I must say that I find it hard to swallow that two seemingly valid ways
>> to analyze the data (item as random effect or fixed effect) yield
>> results that are so different.
>> 
>> Another observation: in the BLUP estimates, the correlation of
>> intercepts and slopes seems to be much higher than in the raw data.  The
>> correlation of the estimated random intercepts and slopes is -0.86. (The
>> summary of the model reports -0.62.)  The correlation of the empirical
>> item means and differences is only -0.4.  Why does lmer believe in such
>> a high correlation?
>> 
>>   Titus
> 