From j.tamminen at psych.york.ac.uk Wed Jul 22 02:46:18 2009
From: j.tamminen at psych.york.ac.uk (Jakke Tamminen)
Date: Wed, 22 Jul 2009 10:46:18 +0100
Subject: [R-lang] Lmer interactions in factorial designs
Message-ID:

This is probably a naive question about linear mixed-effects models, but I'm having a lot of trouble getting my head around it. So here we go:

Imagine a 2 x 3 factorial design, with Factor A having two levels (A1 and A2) and Factor B having three levels (B1, B2, B3). The dependent variable is reaction time (logRT).

If I'm interested in the main effects of A and B, I run the following:

lmer(logRT~A+B+(1|Subject)+(1|Item), data)

This would give me something along these lines:

Fixed effects:
             Estimate Std. Error t value
(Intercept)  6.286297   0.023018  273.11
A2           0.007858   0.004204    1.87
B2          -0.017007   0.003689   -4.61
B3          -0.012179   0.003700   -3.29

If I understand correctly, the model here is evaluating A2 against A1, B2 against B1, and B3 against B1. This leads me to my first question: Is there any way to find out if the main effect of B is significant?

Moving on with the same example, assume that I'm also interested in the interaction between A and B. Specifically, I want to find out whether the effect of A differs at the three levels of B. I run the following model:

lmer(logRT~A*B+(1|Subject)+(1|Item), data)

which would give me something like this:

Fixed effects:
             Estimate Std. Error t value
(Intercept)  6.286656   0.023133  271.76
A2           0.007149   0.006009    1.19
B2          -0.013616   0.005211   -2.61
B3          -0.016637   0.005225   -3.18
A2:B2       -0.006842   0.007377   -0.93
A2:B3        0.008973   0.007395    1.21

These are really hard tables to interpret. I believe we are now seeing the difference between A1 and A2 at B1 (0.007149). Furthermore, the last two lines tell us that at B2 the difference needs to be adjusted by -0.006842, and at B3 it needs to be adjusted by 0.008973, and that these adjustments are non-significant. This model doesn't provide information about the main effects.
If I wanted to report these, would I refer back to the first model?

And my third question: when we do ANOVAs, we're told to first see if the interaction between A and B is significant, and only then look at the interaction contrasts. Lmer in the above table gives you (some of) the contrasts, but doesn't evaluate the interaction as a whole. Do we still need to worry about the interaction as a whole, and if yes, how would we evaluate it?

Many thanks in advance!

Jakke

From Claire.Delleluche at univ-lyon2.fr Fri Jul 24 01:34:04 2009
From: Claire.Delleluche at univ-lyon2.fr (Claire Delle Luche)
Date: Fri, 24 Jul 2009 10:34:04 +0200 (CEST)
Subject: [R-lang] Interactions in lmer
Message-ID: <30501214.2360.1248424445439.JavaMail.root@co7>

Dear R users,

Dealing with mixed models with a binomial DV and interactions between predictors, I still have a few questions I cannot find the answer to. One of my guiding sources for the lmer analysis is the Jaeger and Kuperman WOMM slides.

1- all but one predictor are centered, because the latter is a four-level predictor and I am interested in contrasts. Is this correct? Thus I cannot interpret the intercept as the grand mean. Does the intercept have any meaning at all?

2- reporting interactions: as a whole and not just specific contrasts
For linear models, there is aovlmer.fnc. Is there such a function for mixed models?

3- residualisation
In the best model (var1 is centered, var2 is not as it is a factor), var1 (2 levels) and var2 (4 levels) have a significant interaction and are correlated (-.491, -.527, -.350 for the respective contrasts). Residualisation is a possibility. I was advised to use the following line of code, but I get an error I cannot fix:

corpus$residinteraction = residuals(lm(I(var1*var2) ~ var1 + var2, data = corpus))

The error diagnostic is about having more than two levels for contrast analysis.

Thank you very much in advance.
Claire Delle Luche
Laboratoire Dynamique du Langage
14, avenue Berthelot
69007 Lyon FRANCE

From ches0045 at umn.edu Fri Jul 24 10:53:57 2009
From: ches0045 at umn.edu (Paula Chesley)
Date: Fri, 24 Jul 2009 11:53:57 -0600
Subject: [R-lang] wonky PDF labelling behavior in R
Message-ID: <154A54B4-B74E-4221-9CE8-65AC4B4246E1@umn.edu>

Hi everyone,

When trying to plot a pdf of a certain linguistic distribution, I am having trouble getting the labels I give as arguments to show up in the PDF. What shows up in the PDF are the unhelpful default labels R gives. This is really strange, because I've done the same thing several times with no problem, and on the default x11 output the correct labels show up. See attached figure 1 (correct labelling on x11) and figure 2 (unfortunate labelling in PDF). This happens in both GUI and command-line modes of R. I've shut down other R sessions, and there's no possibility that the file is the wrong file, either.

Here is my code for creating the PDF file:

> pdf("/Users/pchesley/linguistics/research/memory/data/r/figs/figure_2.pdf")  # attached figure_2.pdf
> plot(hist(pct$V1, breaks = seq(0, 1, 0.05)), xlab = "% of speakers having seen new word before", ylab = "number of new words", main=" ")
> dev.off()

Can anyone tell me what's going on here? I'm flummoxed, and hoping it's not painfully obvious. :)

Thanks,
Paula
-------------- next part --------------
A non-text attachment was scrubbed...
Name: figure_1.png
Type: image/png
Size: 225251 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: figure_2.pdf
Type: application/pdf
Size: 5795 bytes
Desc: not available
URL:

From piantado at mit.edu Fri Jul 24 13:38:16 2009
From: piantado at mit.edu (Steven Piantadosi)
Date: Fri, 24 Jul 2009 16:38:16 -0400
Subject: [R-lang] wonky PDF labelling behavior in R (Paula Chesley)
In-Reply-To:
References:
Message-ID: <1248467896.30330.7.camel@shannon>

Hi Paula,

hist itself is a plotting function so you don't need to call plot on it. Try something like

pdf("/Users/pchesley/linguistics/research/memory/data/r/figs/figure_2.pdf")  # attached figure_2.pdf
hist(pct$V1, breaks = seq(0, 1, 0.05), xlab = "% of speakers having seen new word before", ylab = "number of new words", main=" ")
dev.off()

hist also returns info about the histogram (counts, the mids of the bars, etc.), which you can use to do fancy histogram plots if you want. You can read about them with ?hist.

++Steve

On Fri, 2009-07-24 at 13:27 -0700, r-lang-request at ling.ucsd.edu wrote:
> _______________________________________________
> R-lang mailing list
> R-lang at ling.ucsd.edu
> http://pidgin.ucsd.edu/mailman/listinfo/r-lang
>
> End of R-lang Digest, Vol 25, Issue 3
> *************************************

From hjt at stanford.edu Fri Jul 24 23:50:40 2009
From: hjt at stanford.edu (Harry J Tily)
Date: Sat, 25 Jul 2009 15:50:40 +0900
Subject: [R-lang] wonky PDF labelling behavior in R
In-Reply-To: <154A54B4-B74E-4221-9CE8-65AC4B4246E1@umn.edu>
References: <154A54B4-B74E-4221-9CE8-65AC4B4246E1@umn.edu>
Message-ID: <1248504640.6810.5.camel@nezumi>

Hi Paula

I think the problem is that hist() is itself a plotting command -- you don't need to wrap it in plot() at all. It turns out that hist *also* returns a histogram object which you can then plot with plot(), so when you do that the X display immediately replaces the figure created by hist() with the one created by plot()... but the pdf device saves the first figure, the one created directly by hist().

Try passing the graphical parameters (xlab, etc.) directly to hist(), or use plot=FALSE as a parameter to hist() if you want to use plot() to do the rendering.

Hal

From tiflo at csli.stanford.edu Sat Jul 25 14:48:09 2009
From: tiflo at csli.stanford.edu (T. Florian Jaeger)
Date: Sat, 25 Jul 2009 14:48:09 -0700
Subject: [R-lang] Lmer interactions in factorial designs
In-Reply-To:
References:
Message-ID: <38dc9be90907251448s4b890ad7y4f34429d1ff128e2@mail.gmail.com>

Dear Jakke,

my answers are inserted below.

> Imagine a 2 x 3 factorial design, with Factor A having two levels (A1 and
> A2) and Factor B having three levels (B1, B2, B3). The dependent variable
> is reaction time (logRT).
>
> If I'm interested in the main effects of A and B, I run the following:
>
> lmer(logRT~A+B+(1|Subject)+(1|Item), data)
>
> This would give me something along these lines:
>
> Fixed effects:
>              Estimate Std. Error t value
> (Intercept)  6.286297   0.023018  273.11
> A2           0.007858   0.004204    1.87
> B2          -0.017007   0.003689   -4.61
> B3          -0.012179   0.003700   -3.29
>
> If I understand correctly, the model here is evaluating A2 against A1, B2
> against B1, and B3 against B1. This leads me to my first question: Is there
> any way to find out if the main effect of B is significant?

Do I understand correctly that you want an omnibus test assessing whether B contributes significant information to the model? Is that what you mean by the "main effect" of B?
If so, you need to do model comparison of this model against a model without B. Fit both models with method="ML" (see Baayen et al., 2008-JML for elaboration). Though I think if you create the two models and use anova() to compare them, anova() does the right thing anyway. So

l <- lmer(logRT~A+B+(1|Subject)+(1|Item), data)
l.woB <- lmer(logRT~A+(1|Subject)+(1|Item), data)
anova(l, l.woB)

should do the job. Note that I would also at least test whether random slopes for A+B are needed.

> Moving on with the same example, assume that I'm also interested in the
> interaction between A and B. Specifically, I want to find out whether the
> effect of A differs at the three levels of B. I run the following model:
>
> lmer(logRT~A*B+(1|Subject)+(1|Item), data)
>
> which would give me something like this:
>
> Fixed effects:
>              Estimate Std. Error t value
> (Intercept)  6.286656   0.023133  271.76
> A2           0.007149   0.006009    1.19
> B2          -0.013616   0.005211   -2.61
> B3          -0.016637   0.005225   -3.18
> A2:B2       -0.006842   0.007377   -0.93
> A2:B3        0.008973   0.007395    1.21
>
> These are really hard tables to interpret. I believe we are now seeing the
> difference between A1 and A2 at B1 (0.007149). Furthermore, the last two
> lines tell us that at B2 the difference needs to be adjusted by -0.006842,
> and at B3 it needs to be adjusted by 0.008973, and that these adjustments
> are non-significant. This model doesn't provide information about the main
> effects.

Be cautious with the interpretation of A and B's contrast coefficients since there may be collinearity in the model (especially when you include the interaction of A and B). Have you checked the fixed effect correlations? I recommend reading Baayen et al., 2008 and maybe browsing through Baayen's book. Also, check out Victor Kuperman's and my slides for WOMM (http://hlplab.wordpress.com/2009-pre-cuny-workshop-on-ordinary-and-multilevel-models-womm/). These slides cover what you need to do about collinearity (as well as what that is to begin with ;).
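
The collinearity point above can be illustrated without fitting a mixed model at all. What follows is only a minimal sketch in base R, not something taken from the slides: it builds the design matrix for a hypothetical balanced 2 x 3 design and compares treatment coding (the R default) with sum-to-zero coding via contr.sum, which is used here as one assumed way of reducing the correlation. The cell means in y are made up for illustration.

```r
# Balanced 2 x 3 design: one row per cell is enough to inspect the design matrix.
d <- expand.grid(A = factor(c("A1", "A2")), B = factor(c("B1", "B2", "B3")))

# Design matrices under treatment coding (R's default) and sum-to-zero coding.
sum_coding <- list(A = "contr.sum", B = "contr.sum")
X_treat <- model.matrix(~ A * B, data = d)
X_sum   <- model.matrix(~ A * B, data = d, contrasts.arg = sum_coding)

# Correlation between the A column and one interaction column.
cor_treat <- cor(X_treat[, "A2"], X_treat[, "A2:B2"])  # nonzero under treatment coding
cor_sum   <- cor(X_sum[, "A1"], X_sum[, "A1:B1"])      # zero in a balanced design

# Hypothetical cell means (made up for illustration), in the row order of d.
d$y <- c(6.29, 6.30, 6.27, 6.27, 6.27, 6.29)
b_treat <- coef(lm(y ~ A * B, data = d))
b_sum   <- coef(lm(y ~ A * B, data = d, contrasts = sum_coding))

# Treatment coding: the intercept is the A1/B1 cell mean and "A2" is the
# simple effect of A at B1. Sum coding: the intercept is the grand mean
# of the cell means.
print(round(c(cor_treat = cor_treat, cor_sum = cor_sum), 3))
print(round(b_treat[c("(Intercept)", "A2")], 4))
print(round(b_sum["(Intercept)"], 4))
```

With unbalanced data the sum-coded columns would generally no longer be exactly uncorrelated, so the fixed effect correlations reported by the fitted model are still worth checking.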
> If I wanted to report these, would I refer back to the first model?

You report everything from the last model. If you use treatment coding of factors (R default) then what you called main effects actually are not main effects. They are simple effects. To get main effects as in ANOVAs, you should contrast code (contr.sum()) the factors. There are some commented R scripts on coding on our lab wiki (http://wiki.bcs.rochester.edu:2525/HlpLab/StatsCourses/HLPMiniCourse). Conveniently, contrast-coding will also deal with collinearity between main effects and interactions if you have balanced data.

> And my third question: when we do ANOVAs, we're told to first see if the
> interaction between A and B is significant, and only then look at the
> interaction contrasts. Lmer in the above table gives you (some of) the
> contrasts, but doesn't evaluate the interaction as a whole. Do we still
> need to worry about the interaction as a whole, and if yes, how would we
> evaluate it?

If you want to follow ANOVA logic, do model comparison. Start with the full model and then do stepwise removal. For a balanced data set, this procedure basically brings you back to ANOVA-land ;) -- while still taking advantage of mixed models (relaxed assumptions, etc.).

So, start with a full model:

1) l <- lmer(logRT~A*B+(1+A*B|Subject)+(1+A*B|Item), data)
2) follow the procedure outlined on our lab blog to figure out which random effects you need: http://hlplab.wordpress.com/2009/05/14/random-effect-should-i-stay-or-should-i-go/
3) take the resulting model and compare it against a model without the interaction, using anova(l, l.woInteraction).
4) *if removal of the interaction is not significant*, you could further compare the model against a model with only A (see above).
5) Interpret coefficients in the full model or in the reduced model (I would do the former unless I don't have much data or cannot reduce collinearity, but you may prefer the latter).
6) If you find any of the scripts or references given above useful, cite/refer to them, so that others can find them ;)

HTH,
Florian

From tiflo at csli.stanford.edu Sat Jul 25 15:15:28 2009
From: tiflo at csli.stanford.edu (T. Florian Jaeger)
Date: Sat, 25 Jul 2009 15:15:28 -0700
Subject: [R-lang] Interactions in lmer
In-Reply-To: <30501214.2360.1248424445439.JavaMail.root@co7>
References: <30501214.2360.1248424445439.JavaMail.root@co7>
Message-ID: <38dc9be90907251515j2582cea5nbfbe85be6c404eeb@mail.gmail.com>

On Fri, Jul 24, 2009 at 1:34 AM, Claire Delle Luche <Claire.Delleluche at univ-lyon2.fr> wrote:

> Dear R users,
>
> Dealing with mixed models with a binomial DV and interactions between
> predictors, I still have a few questions I cannot find the answer to.
> One of my guiding sources for the lmer analysis is the Jaeger and Kuperman
> WOMM slides.
>
> 1- all but one predictor are centered, because the latter is a four-level
> predictor and I am interested in contrasts. Is this correct? Thus I cannot
> interpret the intercept as the grand mean. Does the intercept have any
> meaning at all?

The intercept always has the meaning of "everything else is 0" --> when the sum of all other beta * predictors is 0 (e.g. when all other predictors are 0), then the linear predictor is the intercept. So, if you have a balanced sample and all predictors are contrast coded except for one 4-level predictor, which is treatment coded, then the intercept corresponds to the mean of the reference condition of the 4-level predictor.

> 2- reporting interactions: as a whole and not just specific contrasts
> For linear models, there is aovlmer.fnc. Is there such a function for mixed
> models?
aovlmer.fnc is for lmer (=mixed) linear models. Do you mean mixed logit models? You can always do the same thing yourself by comparing the difference in deviance between two *nested* models against a chi-square distribution with df1-df2 degrees of freedom (the difference in number of parameters in the two models).

> 3- residualisation
> In the best model (var1 is centered, var2 is not as it is a factor),
> var1 (2 levels) and var2 (4 levels) have a significant interaction and are
> correlated (-.491, -.527, -.350 for the respective contrasts).
> Residualisation is a possibility.
> I was advised to use the following line of code, but I get an error I
> cannot fix:
>
> corpus$residinteraction = residuals(lm(I(var1*var2) ~ var1 + var2, data = corpus))
>
> The error diagnostic is about having more than two levels for contrast
> analysis.

In order to make var1*var2 a continuous outcome (expected by lm()) you need to manually recode the factors into k-1 numerical predictors where k is the number of levels in the predictor. I suspect that your error message is linked to this problem.

HTH,
Florian

From j.tamminen at psych.york.ac.uk Thu Jul 30 02:59:06 2009
From: j.tamminen at psych.york.ac.uk (Jakke Tamminen)
Date: Thu, 30 Jul 2009 10:59:06 +0100
Subject: [R-lang] Lmer interactions in factorial designs
In-Reply-To: <38dc9be90907251448s4b890ad7y4f34429d1ff128e2@mail.gmail.com>
References: <38dc9be90907251448s4b890ad7y4f34429d1ff128e2@mail.gmail.com>
Message-ID: <10E5F039ABEA4664BE0B6FD270FF439A@psych.york.ac.uk>

My thanks to Andy, James, and Florian for their responses to my question.
The replies were, as always, prompt, helpful, and lucid. I have a couple of quick further questions about model comparison: I think all three replies included suggestions of doing likelihood ratio tests to assess the significance of a single fixed factor in the model. How reliable is this? As far as I can recall, Baayen in his book and in the JML paper only uses this to evaluate random factors, and the paper by Bolker et al. that Andy cited recommends against it in the case of fixed factors. Are there good alternatives?

Finally, a quick follow-up question regarding Florian's six-step procedure, reproduced below. In step 5 you suggest I interpret the coefficients in the full _or_ the reduced model. So is it acceptable to look at the coefficients of a factor or an interaction even if the factor or interaction does not "survive" a likelihood ratio test, i.e. does not significantly contribute to the fit of the model?

I hope that makes sense, thank you again for all the help!

Jakke

1) l <- lmer(logRT~A*B+(1+A*B|Subject)+(1+A*B|Item), data)
2) follow the procedure outlined on our lab blog to figure out which random effects you need: http://hlplab.wordpress.com/2009/05/14/random-effect-should-i-stay-or-should-i-go/
3) take the resulting model and compare it against a model without the interaction, using anova(l, l.woInteraction).
4) if removal of the interaction is not significant, you could further compare the model against a model with only A (see above).
5) Interpret coefficients in the full model or in the reduced model (I would do the former unless I don't have much data or cannot reduce collinearity, but you may prefer the latter).
6) If you find any of the scripts or references given above useful, cite/refer to them, so that others can find them ;)

From tiflo at csli.stanford.edu Fri Jul 31 13:16:41 2009
From: tiflo at csli.stanford.edu (T.
Florian Jaeger)
Date: Fri, 31 Jul 2009 16:16:41 -0400
Subject: [R-lang] Lmer interactions in factorial designs
In-Reply-To: <10E5F039ABEA4664BE0B6FD270FF439A@psych.york.ac.uk>
References: <38dc9be90907251448s4b890ad7y4f34429d1ff128e2@mail.gmail.com> <10E5F039ABEA4664BE0B6FD270FF439A@psych.york.ac.uk>
Message-ID: <38dc9be90907311316i1efd7fe7jb966d8d3b72e9053@mail.gmail.com>

On Thu, Jul 30, 2009 at 5:59 AM, Jakke Tamminen wrote:

> My thanks to Andy, James, and Florian for their responses to my question.
> The replies were, as always, prompt, helpful, and lucid. I have a couple of
> quick further questions about model comparison: I think all three replies
> included suggestions of doing likelihood ratio tests to assess the
> significance of a single fixed factor in the model. How reliable is this? As
> far as I can recall, Baayen in his book and in the JML paper only uses this
> to evaluate random factors, and the paper by Bolker et al. that Andy cited
> recommends against it in the case of fixed factors. Are there good
> alternatives?

It's still being debated what's best to be done there, but I think it's a valid alternative for now and especially so for simple models.

> Finally, a quick follow-up question regarding Florian's six-step procedure,
> reproduced below. In step 5 you suggest I interpret the coefficients in the
> full _or_ the reduced model. So is it acceptable to look at the coefficients
> of a factor or an interaction even if the factor or interaction does not
> "survive" a likelihood ratio test, i.e. does not significantly contribute to
> the fit of the model?

I would usually leave non-significant predictors in the model *if they are theoretically motivated* (which should be why you put them in there to begin with ;)).
There are many different traditions and approaches, but I feel that, if you have enough data to avoid overfitting or other problems, you should leave even relatively insignificant predictors in the model (p>.7 [sic] is often given as a removal threshold).

HTH,
Florian
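
For readers who want to see the arithmetic behind the likelihood ratio comparison discussed in this thread, here is a minimal sketch. It is deliberately simplified and rests on several assumptions that are not from the thread: the data are simulated, the effect sizes are invented, and the models are ordinary lm() fits without random effects so that only base R is needed. For lmer models the same calculation would apply to two nested fits estimated by maximum likelihood (in current lme4 versions that is requested with REML = FALSE, as far as I know), which is also what anova() on two such fits reports.

```r
set.seed(1)

# Simulated balanced 2 x 3 design with 50 observations per cell (all values hypothetical).
d <- expand.grid(A = factor(c("A1", "A2")),
                 B = factor(c("B1", "B2", "B3")),
                 rep = 1:50)
d$logRT <- 6.3 +
  0.010 * (d$A == "A2") -
  0.020 * (d$B == "B2") -
  0.015 * (d$B == "B3") +
  rnorm(nrow(d), sd = 0.05)

# Nested models: with and without B.
m_full <- lm(logRT ~ A + B, data = d)
m_noB  <- lm(logRT ~ A, data = d)

# Likelihood ratio statistic: twice the difference in maximised log-likelihood,
# which equals the difference in deviance (-2 * logLik) between the two models.
lr_stat <- as.numeric(2 * (logLik(m_full) - logLik(m_noB)))

# Degrees of freedom: difference in the number of estimated parameters.
df_diff <- attr(logLik(m_full), "df") - attr(logLik(m_noB), "df")

# Compare against a chi-square distribution with df_diff degrees of freedom.
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

print(c(lr_stat = lr_stat, df = df_diff, p = p_value))
```

The chi-square reference distribution is an approximation, which is part of why the reliability of this test for fixed effects is debated above.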