[R-lang] Re: trouble with random effects in factorial design
Hofmeister, Philip
phofme@essex.ac.uk
Tue Dec 3 07:13:35 PST 2013
Hey Daniel,
I don't know exactly what the answer is here, but there are a couple of things worth noting that may lead to one.
One, I noticed your factors were dummy coded, so I changed them to sum coding, which allows the model to converge successfully without the control option at the end and gets rid of the high correlations between the factors.
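For the record, the recoding was just the standard contrasts switch (assuming focus and order are already factors in dat):

contrasts(dat$focus) <- contr.sum(2)  # sum coding in place of the default treatment (dummy) coding
contrasts(dat$order) <- contr.sum(2)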
> formula(mm.1)
response ~ focus * order + (focus * order | subject) + (focus * order | item)
> formula(mm.2)
response ~ order * focus + (order * focus | subject) + (order * focus | item)
Two, while the fixed effects are the same, the standard errors of the fixed-effect coefficients (and intercept) are not.
library(arm)
se.fixef(mm.1)
  (Intercept)        focus1        order1 focus1:order1 
   0.17162262    0.10961756    0.11473215    0.08898726 
se.fixef(mm.2)
  (Intercept)        order1        focus1 order1:focus1 
   0.17072794    0.11459095    0.10884840    0.08973463 
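If you don't have arm installed, the same standard errors come straight out of the estimated covariance matrix of the fixed effects:

sqrt(diag(vcov(mm.1)))  # fixed-effect standard errors for mm.1
sqrt(diag(vcov(mm.2)))  # and for mm.2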
Three, your data are very highly skewed.
summary(as.factor(dat$response))
  0   1   2   3   4   5   6   7   8   9  10 
  6   3   6   9   7   6  13  38  49  99 276 
Gelman & Hill suggest that even transforming the response variable is not adequate here, and that a probit/ordinal regression model would be better. Alas, as far as I can tell there is no way of including random slopes via clmm(). An alternative, again via G&H, is to compare the levels step-by-step with a logistic regression.
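For concreteness, something like the following (the ordered-factor conversion, the probit link, the intercepts-only random effects, and the cutpoint at 10 are just illustrative choices on my part):

library(ordinal)  # for clmm()
dat$response.ord <- factor(dat$response, ordered = TRUE)
# ordinal mixed model; random intercepts only, given clmm()'s limitation noted above
om <- clmm(response.ord ~ focus * order + (1 | subject) + (1 | item), data = dat, link = "probit")
summary(om)
# one step of the level-by-level logistic comparison: response >= 10 vs. below
dat$high <- as.numeric(dat$response >= 10)
gm <- glmer(high ~ focus * order + (1 | subject) + (1 | item), dat, family = binomial)
summary(gm)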
None of this directly answers your basic question, sadly. In fact, I found that while mm.2 converges successfully, mm.3 below does not, even though only the order of the fixed-effect terms differs! My guess is that the data violate some assumptions, but if there's an alternative answer, I'd be interested to hear it.
mm.3 <- lmer(response ~ focus * order + (order * focus | subject) + (order * focus | item), dat)
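If anyone wants to poke at that convergence failure, one standard thing to try (untested here) is a different optimizer via lmerControl():

mm.3b <- lmer(response ~ focus * order + (order * focus | subject) + (order * focus | item),
              dat, control = lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 100000)))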
Philip
On Dec 3, 2013, at 9:35 AM, Daniel Ezra Johnson wrote:
Dear R-Lang,
I have noticed a difference in the random-effects results depending on the order of terms in the model formula, something that (to say the least) I don't think should be happening.
The fixed effects results are identical. This is with lme4_1.0-5.
I have some (simplified) data that you can load as follows:
dat <- read.csv("http://www.danielezrajohnson.com/dej_test.csv")
Briefly, the data has 32 subjects and 32 items. Each subject has four observations of "response" in each of four conditions (focus: "VP" vs. "object", order: "vpo" vs. "vop"), so there are 32 x 16 = 512 observations.
The design is (not perfectly) counterbalanced by Latin Square, so that each subject saw 16 items, but the combination of items and conditions differed from subject to subject. Put another way, each of the 32 items is supposed to occur equally often in each of the four conditions. This is not exactly true in the example, but I don't think it should affect the results.
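You can see the (im)balance directly with a cross-tabulation:

xtabs(~ item + focus + order, data = dat)  # counts per item per condition cell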
mm.1 <- lmer(response ~ focus * order + (focus * order | subject) + (focus * order | item), dat, control = lmerControl(optCtrl = list(maxfun = 100000)))
mm.2 <- lmer(response ~ order * focus + (order * focus | subject) + (order * focus | item), dat, control = lmerControl(optCtrl = list(maxfun = 100000)))
> fixef(mm.1)
     (Intercept)          focusVP         ordervpo focusVP:ordervpo 
       8.7265625        0.3359375        0.1171875       -0.7578125 
> fixef(mm.2)
     (Intercept)         ordervpo          focusVP ordervpo:focusVP 
       8.7265625        0.1171875        0.3359375       -0.7578125 
You can see that the fixed effects estimates are EXACTLY the same.
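You can confirm this mechanically (only the names and order of the terms differ), e.g.:

all.equal(sort(unname(fixef(mm.1))), sort(unname(fixef(mm.2))))  # TRUE if the estimates coincide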
The random effects, however, are somewhat different:
> VarCorr(mm.1)
 Groups   Name             Std.Dev. Corr
 subject  (Intercept)      1.36674
          focusVP          1.02059  -0.808
          ordervpo         1.75084  -0.898  0.820
          focusVP:ordervpo 2.99477   0.862 -0.930 -0.886
 item     (Intercept)      0.65516
          focusVP          0.78447  -0.749
          ordervpo         1.20179  -0.205  0.256
          focusVP:ordervpo 1.38629   0.253 -0.063 -0.719
 Residual                  1.61041
> VarCorr(mm.2)
 Groups   Name             Std.Dev. Corr
 subject  (Intercept)      1.03365
          ordervpo         0.77706  -0.675
          focusVP          1.27217   0.542 -0.064
          ordervpo:focusVP 1.73912   0.603 -0.124  0.609
 item     (Intercept)      0.10477
          ordervpo         0.92461   1.000
          focusVP          0.47122   0.682  0.686
          ordervpo:focusVP 1.68445  -0.137 -0.134  0.469
 Residual                  1.60852
The deviance estimates are also not quite the same.
> deviance(mm.1)
    REML 
 2151.61 
> deviance(mm.2)
     REML 
2175.503 
My real question is why the models are not identical. A secondary question is why, given that they differ, the fixed effects nevertheless come out identical. To be clear, I do think the fixed effects should be identical; what is a mystery to me is why the random effects are different.
To reiterate, the only difference between the two models is the order in which the two factors are entered into the formula.
I hope someone can shed some light onto this, if indeed it hasn't been asked before.
Thanks very much,
Dan