Hi Peter-ling,<br><br>can you say a bit more about your data? How many data points are there in each of the cells A-D, A-E, A-F, B-D, B-E, and B-F? If I had to guess, I would say you have either more or less data in D than in E or F (depending on how you contrast coded). Is that the way your data is unbalanced? A vs. B seems balanced, and I would even guess that the distribution of D vs. E vs. F is the same within A and within B, but that the overall distribution of D vs. E vs. F is unbalanced?<br>
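In case it helps, here is a minimal sketch of how to get those counts. The data frame `d` and its column names `AB` and `DEF` are made up for illustration, so substitute your own data and variable names.

```r
# Hypothetical stand-in for the real data: one row per observation
d <- data.frame(
  AB  = factor(rep(c("A", "B"), each = 8)),
  DEF = factor(rep(c("D", "D", "D", "D", "E", "E", "F", "F"), times = 2))
)

# number of observations in each cell of the 2 x 3 design
cell_counts <- xtabs(~ AB + DEF, data = d)
cell_counts

# overall distribution of D vs. E vs. F
margin.table(cell_counts, 2)

# distribution of D vs. E vs. F within A and within B (rows sum to 1)
prop.table(cell_counts, 1)
```

If the rows of the last table look alike but the overall counts for D, E, and F differ, the data is unbalanced in the way guessed above.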
<br>Anyway, for certain distributions of data, contrast coding alone won't solve the problem. This becomes quite apparent if you play around with the following simulation:<br><br>====================================================<br>
# number of subjects<br>n_s = 20<br># number of items within subjects<br>n_per_s = 16<br><br># let's create an unbalanced data set of the type that I suspect you have<br># (see above)<br>AvB <- as.factor(rep(c(rep("a", 8), rep("b", 8)), n_s))<br>
DvEvF <- as.factor(rep(c("d","d","d", "d", "e", "e", "f", "f"), n_s * 2))<br># subject IDs: one block of n_per_s rows per subject<br>s <- rep(seq_len(n_s), each = n_per_s)<br># subject effects (random intercepts)<br>
rs <- rnorm(n_s,0,0.5)<br><br># clumsy way to encode fixed effects<br>effect <- function(x) {<br> ifelse(x == "a", 2,<br> ifelse(x == "b", 1.2,<br> ifelse(x == "d", 0.5,<br>
ifelse(x == "e", -1.5,<br> ifelse(x == "f", -0.3, NA)<br> )<br> )<br> )<br> )<br>}<br># outcome is specified by effects of AvB and DvEvF (no interaction, but<br># feel free to add one) plus noise plus subject effects (also noise).<br>
y <- effect(AvB) + effect(DvEvF) + rnorm(n_s * n_per_s,0,0.2) + rs[s]<br><br>library(lme4)<br><br># treatment coding<br>contrasts(DvEvF) <- contr.treatment(3)<br>lmer(y ~ AvB + DvEvF + (1 | s))<br><br># contrast (sum) coding<br>
contrasts(DvEvF) <- contr.sum(3)<br>lmer(y ~ AvB + DvEvF + (1 | s))<br><br># or with interactions:<br>contrasts(AvB) <- contr.sum(2)<br>contrasts(DvEvF) <- contr.sum(3)<br>lmer(y ~ AvB * DvEvF + (1 | s))<br><br>
# the same sum coding with the levels reordered to e, d, f,<br>
# to show that the coding above does the same as what you do<br>DvEvF <- factor(DvEvF, levels=c("e", "d", "f"))<br>contrasts(DvEvF) <- contr.sum(3)<br>lmer(y ~ AvB + DvEvF + (1 | s))<br>
====================================================<br><br>This gives you something very similar to your pattern of fixed effect correlations. But play around with other parameters to get a feel for it.<br><br>What can you do now? First, many people would consider a fixed effect correlation below .5 to be a reason for caution, but not a reason to disregard the results. In the corpus work I do, I usually try to keep fixed effect correlations much smaller, but I think folks find .5 acceptable. Maybe others can chime in and let you know whether I am off on that.<br>
<br>Any method that further reduces collinearity will require some decisions (e.g. what to residualize against what; PCA; or some alternative coding). For example, Helmert coding works quite well for the pseudo data set above, if your hypothesis was that e < f < d (and you re-sort the factor levels accordingly). Fixed effect correlations would then all be < 0.26.<br>
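To see why the coding matters here, a rough sketch that compares the two codings on the simulated design. It makes a simplifying assumption: it ignores the random subject effects and looks only at the correlation of ordinary least squares coefficient estimates implied by the design matrix, cov2cor((X'X)^-1). The entries involving the intercept will therefore differ from what lmer reports. The helper `coef_cor` is a made-up name, not something from lme4.

```r
# Design from the simulation: 20 subjects x 16 items,
# with d twice as frequent as e or f
n_s <- 20
AvB   <- factor(rep(c(rep("a", 8), rep("b", 8)), n_s))
DvEvF <- factor(rep(c("d", "d", "d", "d", "e", "e", "f", "f"), n_s * 2))

# Hypothetical helper: correlation of OLS coefficient estimates
# implied by the design alone (random effects are ignored)
coef_cor <- function(f, coding) {
  contrasts(f) <- coding
  X <- model.matrix(~ AvB + f)   # columns: (Intercept), AvBb, f1, f2
  round(cov2cor(solve(crossprod(X))), 3)
}

# sum coding, levels in their default order d, e, f
cor_sum <- coef_cor(DvEvF, contr.sum(3))

# Helmert coding, levels re-sorted to match the hypothesis e < f < d
cor_helmert <- coef_cor(factor(DvEvF, levels = c("e", "f", "d")),
                        contr.helmert(3))

cor_sum["f1", "f2"]      # about -0.43: the two sum contrasts are correlated
cor_helmert["f1", "f2"]  # 0: the two Helmert contrasts are orthogonal here
```

Because e and f are equally frequent in this design, the first Helmert contrast (e vs. f) is orthogonal to the second (d vs. the mean of e and f). That will not hold in general if e and f have different cell sizes.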
<br>HTH,<br>Florian<br><br><div class="gmail_quote">On Thu, Oct 1, 2009 at 11:11 AM, Peter Graff <span dir="ltr"><<a href="mailto:graff@mit.edu">graff@mit.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div bgcolor="#ffffff" text="#000000">
Dear R-Langs,<br>
<br>
I'm currently trying to analyze some experimental data with a contrast
coded mixed model. I have a 2 (levels:A,B) x 3 (levels:D,E,F) design
with unbalanced cell sizes. I am coding the following variables:<br>
<br>
AvB, DvE, EvF<br>
<br>
where sum(AvB)=0, sum(DvE)=0, sum(EvF)=0. Next I fit this model:<br>
<br>
lmer(DV~AvB*(DvE+EvF)+(1|Subj)+(1|Item))<br>
<br>
And here is the correlation matrix it outputs:<br>
<br>
Correlation of Fixed Effects:<br>
(Intr) AvB DvE EvF AvB:DvE<br>
AvB -0.029 <br>
DvE 0.010 -0.006 <br>
EvF -0.002 -0.001 <b><font color="#cc0000">0.488</font> </b>
<br>
AvB:DvE -0.003 0.015 -0.039 -0.018 <br>
AvB:EvF -0.001 -0.002 -0.018 -0.044 <font color="#000099"><b>0.488 </b></font><br>
<br>
The question I have is, how to get rid of the collinearity in red and
blue and whether it's even possible. And if it's not possible, in what
way will this affect the reliability of my result?<br>
<br>
Thanks so much in advance,<br>
<br>
Peter<br>
</div>
<br></blockquote></div><br>