[R-lang] Re: Main effects of categorical predictors in lmer

hossein karimi karimihussein@gmail.com
Wed Oct 12 00:02:08 PDT 2011


Hi Florian,

Thank you very much. I really appreciate it

Regards,

hossein

On Tue, Oct 11, 2011 at 10:06 PM, T. Florian Jaeger <tiflo@csli.stanford.edu
> wrote:

>
>
> On Tue, Oct 11, 2011 at 7:51 AM, hossein karimi <karimihussein@gmail.com>wrote:
>
>> Hi Florian,
>>
>> Thank you very much for your help. I did what you suggested and got
>> significant main effects for A and B as well as their interaction. Just one
>> question out of curiosity: To obtain main effects, can I also compare a
>> 'base' model (including only random slopes) and then add the predictors one
>> by one to see whether they improve the model significantly? I mean,
>> something like this:
>> l.base = lmer(response ~ (1 + A | sub) + (1 | item), data, family="binomial")
>> l.A = lmer(response ~ A + (1 + A | sub) + (1 | item), data, family="binomial")
>> l.AB = lmer(response ~ A + B + (1 + A | sub) + (1 | item), data, family="binomial")
>> l.full = lmer(response ~ A + B + A:B + (1 + A | sub) + (1 | item), data, family="binomial")
>>
>> And then, I can go 'anova(l.base, l.A, l.AB, l.full)'. I did this and got
>> the same results. I was just wondering if there is a difference between this
>> and what you suggested.
>>
>
> the order in which the models are fitted does not change the results.
>
> B matters if l.AB is better than l.A or if l.full is better than l.AB
> (since then you have at least an interaction, perhaps more).
>
> Whether you have a main effect or *just* an interaction is harder to test this
> way: you can test whether you have an interaction by comparing l.AB vs.
> l.full, but significance of l.AB compared to l.A, or significance of l.B
> compared to l.base, does *not* guarantee a 'main effect' in the sense of an
> ANOVA (since B might matter due to the interaction). However, if A and B
> were perfectly balanced and therefore orthogonal in their effects, then
> significance of l.B vs. l.base already shows you that there is a main effect
> of B (same for l.AB vs. l.A).
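>
> In code, the comparisons just described amount to something like this (a
> minimal sketch, reusing the model names defined earlier in this thread;
> anova() on two nested lmer fits gives a likelihood-ratio test):
>
> anova(l.A, l.AB)     # does adding B improve the fit (main effect and/or interaction)?
> anova(l.AB, l.full)  # does adding the A:B term improve the fit (interaction)?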
>
> hth,
>
> Florian
>
>
>
>
> On Mon, Oct 10, 2011 at 9:07 PM, T. Florian Jaeger <
> tiflo@csli.stanford.edu> wrote:
>
>> Hi Hossein,
>>
>> so, let's take this one step at a time. Here's what you can do to answer
>> whether A or B or both have an effect (I am going to assume that the random
>> effect structure you used, (1 + A | sub) + (1 | item), has been shown to be
>> the appropriate one or at least sufficient; if not, adapt accordingly):
>>
>> l.full = lmer(response ~ A + B + A:B + (1 + A | sub) + (1 | item), data,
>> family="binomial")
>> l.AB = lmer(response ~ A + B + (1 + A | sub) + (1 | item), data,
>> family="binomial")
>> l.A = lmer(response ~ A + (1 + A | sub) + (1 | item), data,
>> family="binomial")
>> l.B = lmer(response ~ B + (1 | sub) + (1 | item), data, family="binomial")
>>
>> if anova(l.full, l.A) is significant, B has an effect (main effect or
>> interaction).
>> if anova(l.full, l.B) is significant, A has an effect (main effect or
>> interaction).
>>
>> (If these comparisons are not significant, non-significance of A or B,
>> respectively, does *not* follow.) Given what you've shown us already, both
>> will be significant. Then
>>
>> if anova(l.full, l.AB) is significant, you have a significant interaction
>> of A and B.
>>
>> This will also be the case given what you've shown us so far. You don't
>> technically need to do this. But I wanted to let you know what you would
>> need to do to get the closest equivalent to the so-called omnibus test that
>> the ANOVA would provide.
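>>
>> Put together, the omnibus-style comparisons described above would look
>> something like this (a sketch, using the model names defined above):
>>
>> anova(l.full, l.A)   # contribution of B (main effect and/or interaction)
>> anova(l.full, l.B)   # contribution of A (main effect and/or interaction)
>> anova(l.full, l.AB)  # the A:B interaction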
>>
>> Now, the regression model (if appropriately coded) also gives
>> you interpretable information about the contrasts between the three levels
>> of B. Which coding I would a priori use is hard to tell since you chose to
>> abstract away from your real data (A and B, rather than NP and .... ;)).
>> Btw, it usually helps if you describe your actual data and provide e.g.
>> head(data[, relevantcolumns]).
>>
>> With regard to your email:
>>
>> > contrasts(B)
>>>                  [,1] [,2]
>>> B2               0    1
>>> B1              -1   -1
>>> B3               1    0
>>>
>>> The three levels of B are coded with names rather than numbers and the
>>> baseline condition of B alphabetically comes second; that's why I manually
>>> recoded the matrix so that the baseline receives -1. However, as you can
>>> see, this matrix is different from the one Florian posted. I used the
>>> following code to create the contrasts for B (but I just read Ian's message
>>> which says this is wrong):
>>>
>>> > contrasts (B)<- contr.sum(3)
>>> > contrasts(B) <- cbind(c(0,-1,1),c(1,-1,0))
>>>
>>
>> so here the second line overrides the first! Here's what I recommend:
>>
>> # order of levels argument matters!
>> B = factor(B, levels = c("B1", "B2", "B3"))
>> contrasts(B) = contr.helmert(3) / 2
>> # dividing by 2 to give you distances of 1 between group means
>>
>> which will give you:
>>
>> > contrasts(B)
>>      [,1] [,2]
>> B1   -0.5 -0.5
>> B2    0.5 -0.5
>> B3    0.0  1.0
>>
>> which is what you seem to want. Contrast 'B1' will contrast B==2 against
>> the baseline B==1. Contrast 'B2' will contrast B==3 against the mean of B==1
>> and B==2.
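>>
>> If you want to double-check, here is a small self-contained illustration
>> (with a toy three-level factor, just for verification) that contr.helmert(3)/2
>> produces the matrix above:
>>
>> B <- factor(c("B1", "B2", "B3"), levels = c("B1", "B2", "B3"))
>> contrasts(B) <- contr.helmert(3) / 2
>> contrasts(B)
>> #    [,1] [,2]
>> # B1 -0.5 -0.5
>> # B2  0.5 -0.5
>> # B3  0.0  1.0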
>>
>> I recommend you contrast code A:
>>
>> A = factor(A, levels = c("baseline", "notbaseline"))  # baseline level first
>> contrasts(A) = contr.helmert(2) / 2  # gives -.5 / .5
>>
>> > contrasts(A)
>>              [,1]
>> baseline     -0.5
>> notbaseline   0.5
>>
>> Then l.full (above) should be rather interpretable.
>>
>> Florian
>>
>>>
>>> So, if I want to compare the baseline condition of B with the mean of the
>>> other two conditions of B, I need to create contrasts like Florian's? When I
>>> Helmert code the levels of B with the baseline receiving 2 and the other two
>>> levels of B receiving -1, I get the following confusing contrasts. I have no
>>> idea what this means. I’m just hoping that I’m comparing the baseline with
>>> the mean of the other B levels:
>>> > contrasts (helmertB)
>>>      2
>>> -1  0
>>> 2   1
>>>
>>>
>>> And here is the complete output of the first model (where A is numerical
>>> and centered):
>>>
>>>
>>> Generalized linear mixed model fit by the Laplace approximation
>>> Formula: response ~ A * B + (A + 1 | sub) + (1 | item)
>>>    Data: mydata
>>>    AIC   BIC logLik deviance
>>>  745.1 794.6 -362.6    725.1
>>> Random effects:
>>>
>>>  Groups Name        Variance Std.Dev. Corr
>>>  item   (Intercept) 0.93903  0.96903
>>>  sub    (Intercept) 2.39079  1.54622
>>>         A           0.34163  0.58449  0.551
>>> Number of obs: 1038, groups: item, 42; sub, 36
>>>
>>> Fixed effects:
>>>             Estimate Std. Error z value Pr(>|z|)
>>> (Intercept)   1.1661     0.3285   3.550 0.000385 ***
>>> A            -2.0741     0.1690 -12.270  < 2e-16 ***
>>> B1           -0.4166     0.1580  -2.636 0.008389 **
>>> B2           -0.2867     0.1783  -1.608 0.107842
>>> A:B1          0.9881     0.1594   6.199 5.68e-10 ***
>>> A:B2         -0.7101     0.1750  -4.059 4.94e-05 ***
>>> ---
>>>
>>> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>
>>> Correlation of Fixed Effects:
>>>       (Intr)  NP     lngt.1 lngt.2 NP:l.1
>>> A      0.121
>>> B1    -0.173  0.251
>>> B2     0.003  0.060 -0.281
>>> A:B1   0.141 -0.304 -0.259 -0.095
>>> A:B2   0.021  0.056 -0.081 -0.300 -0.334
>>>
>>> Florian, you already interpreted this output but it was based on your
>>> assumption of the contrasts matrix for B…
>>>
>>>
>>> By the way, I didn't quite get Ian's suggestion for A. Should I keep it
>>> numerical and centered?
>>>
>>> Many thanks,
>>>
>>> hossein
>>>
>>>
>>> On Mon, Oct 10, 2011 at 6:11 PM, Finlayson, Ian <IFinlayson@qmu.ac.uk>wrote:
>>>
>>>>
>>>> You probably are thinking in the ANOVA way (will leave that for others
>>>> to decide if that's good/bad, appropriate/inappropriate) but anova(model)
>>>> will give you an ANOVA-style table for your model (minus the p values).
>>>>
>>>> Not entirely sure what you mean about having added NP, I assume you
>>>> added it as another fixed effect, but it's not listed in your results, so
>>>> you may have meant something else. If B2 becomes significant after including
>>>> NP then that just suggests to me that NP cleans up noise in your data,
>>>> allowing B2 to become significant (can't help but notice that the p value in
>>>> the first model is only just over .1, so it's not massively far from
>>>> significance).
>>>>
>>>> I should have made the point earlier, but I will now: If you're actually
>>>> interested in whether there's a difference between B1 and B2, and between B1
>>>> and B3, then sum coding isn't appropriate. Sum coding will tell you if
>>>> there's a difference between the levels of the factor and the grand mean. If
>>>> you actually want to compare B1 with B2 and with B3, then you'll need to use
>>>> treatment coding. Your model suggests this is experimental data (Subject,
>>>> Item etc.) so I'll assume it's a roughly balanced design and collinearity
>>>> might not be a huge concern.
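>>>>
>>>> For reference, the two coding schemes for a three-level factor look like this
>>>> in base R (a quick illustrative check):
>>>>
>>>> contr.treatment(3)  # each column compares one level against the baseline (level 1)
>>>> #   2 3
>>>> # 1 0 0
>>>> # 2 1 0
>>>> # 3 0 1
>>>> contr.sum(3)        # each column compares one level against the grand mean
>>>> #   [,1] [,2]
>>>> # 1    1    0
>>>> # 2    0    1
>>>> # 3   -1   -1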
>>>>
>>>> You're correct that you could use subsetting (dividing the data
>>>> according to A makes sense to me, but that is obviously atheoretical as I
>>>> have no idea what A represents) to get a clear idea of your pattern of
>>>> results. Although, as I've just noticed Florian showing, you can get a good
>>>> picture if you understand the coding and work through the coefficients.
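>>>>
>>>> (If you do go the subsetting route, a minimal sketch would be to fit the
>>>> B-only model within each level of A, e.g.:)
>>>>
>>>> l.A1 <- lmer(response ~ B + (1 | sub) + (1 | item),
>>>>              data = subset(data, A == "A1"), family = "binomial")
>>>> l.A2 <- lmer(response ~ B + (1 | sub) + (1 | item),
>>>>              data = subset(data, A == "A2"), family = "binomial")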
>>>>
>>>> HTH,
>>>>
>>>> Ian
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: hossein karimi [mailto:karimihussein@gmail.com]
>>>> Sent: Mon 10/10/2011 16:15
>>>> To: Finlayson, Ian
>>>> Cc: ling-r-lang-l@mailman.ucsd.edu
>>>> Subject: Re: [R-lang] Main effects of categorical predictors in lmer
>>>>
>>>> Hi Ian,
>>>>
>>>> Thank you for your help. I did what you suggested and got the following
>>>> output:
>>>> Fixed effects:
>>>>             Estimate Std. Error z value Pr(>|z|)
>>>> (Intercept)   1.1661     0.3285   3.550 0.000385 ***
>>>> A            -2.0741     0.1690 -12.270  < 2e-16 ***
>>>> B1           -0.4166     0.1580  -2.636 0.008389 **
>>>> B2           -0.2867     0.1783  -1.608 0.107842
>>>> A:B1          0.9881     0.1594   6.199 5.68e-10 ***
>>>> A:B2         -0.7101     0.1750  -4.059 4.94e-05 ***
>>>>
>>>> I think this is telling me there is a main effect of A, but my problem is
>>>> that I also need to report a main effect of B, and that's something I don't
>>>> find in this output. I might be thinking in the old ANOVA way, but I think I
>>>> should report main effects as well as a main interaction (if there is any),
>>>> and then if there's a significant interaction, I should proceed with
>>>> breaking down the interaction to see where exactly it lies. Am I right? Or
>>>> are lmer results reported differently?
>>>>
>>>> By the way, I also ran a model with NP as a factor and got the following
>>>> result:
>>>> Fixed effects:
>>>>             Estimate Std. Error z value Pr(>|z|)
>>>> (Intercept)   0.9371     0.3313   2.829  0.00468 **
>>>> A1            2.0857     0.1700  12.269  < 2e-16 ***
>>>> B1           -0.3075     0.1544  -1.991  0.04646 *
>>>> B2           -0.3651     0.1735  -2.105  0.03533 *
>>>> A1:B1        -0.9936     0.1603  -6.199 5.68e-10 ***
>>>> A1:B2         0.7139     0.1759   4.058 4.96e-05 ***
>>>>
>>>> The effect of B2 is significant in the second model above but not in the
>>>> first! Why is that?
>>>>
>>>> Many thanks,
>>>>
>>>> Hossein
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Oct 10, 2011 at 3:27 PM, Finlayson, Ian <IFinlayson@qmu.ac.uk
>>>> >wrote:
>>>>
>>>> > Hi Hossein,
>>>> >
>>>> > As far as I can see, neither model is correct. If B is a predictor with
>>>> > three levels then the model should contain 6 fixed effects: an intercept,
>>>> > A, B1 compared to B2, B1 compared to B3, B1 compared to B2 interacting
>>>> > with A, and B1 compared to B3 interacting with A.
>>>> >
>>>> > What I suspect you've done is actually enter B as a continuous predictor
>>>> > with a mean of zero (since you used scale()). You'll need to keep B as a
>>>> > factor, but give it appropriate contrasts:
>>>> >
>>>> > B <- factor(B) # Make sure it's a factor
>>>> > contrasts(B) <- contr.sum(3) # Generate a 3-level contrast matrix using
>>>> > # sum coding; alternatively contr.helmert(3) for Helmert coding
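>>>> >
>>>> > (A toy illustration of the difference, with made-up values: entered as a
>>>> > scaled numeric, B contributes a single column to the design matrix, whereas
>>>> > kept as a factor with sum contrasts it contributes two.)
>>>> >
>>>> > B <- factor(rep(c("B1", "B2", "B3"), 4))
>>>> > head(scale(as.numeric(B)))   # one centered numeric column
>>>> > contrasts(B) <- contr.sum(3)
>>>> > head(model.matrix(~ B))      # intercept plus two contrast columns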
>>>> >
>>>> > Then build your models.
>>>> >
>>>> > It's possible that it won't set the right level as a baseline (the baseline
>>>> > will be whichever is a row of -1). If so then you'll need to manually set up
>>>> > the matrix. For sum coding, something like:
>>>> >
>>>> > contrasts(B) <- cbind(c(-1,0,1), c(-1,1,0)) # Assuming that B1 is the first
>>>> > # row in contrasts(B) - double check!
>>>> >
>>>> > I'd also do something similar with A so that your estimates are comparable
>>>> > (if you've centered 0 and 1, then it'll come out -.5 and .5, rather than -1
>>>> > and 1).
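>>>> >
>>>> > (A small illustration of that point, with a hypothetical balanced 0/1
>>>> > predictor: centering gives -.5/.5, and doubling it puts it on the same
>>>> > -1/1 scale as the sum-coded contrasts above.)
>>>> >
>>>> > A <- c(0, 1, 0, 1)
>>>> > c(scale(A, scale = FALSE))      # -0.5  0.5 -0.5  0.5
>>>> > c(2 * scale(A, scale = FALSE))  # -1    1   -1    1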
>>>> >
>>>> > HTH,
>>>> >
>>>> > Ian
>>>> >
>>>> > From: ling-r-lang-l-bounces@mailman.ucsd.edu [mailto:ling-r-lang-l-bounces@mailman.ucsd.edu] On Behalf Of hossein karimi
>>>> > Sent: 10 October 2011 15:05
>>>> > To: ling-r-lang-l@mailman.ucsd.edu
>>>> > Subject: [R-lang] Main effects of categorical predictors in lmer
>>>> >
>>>> > Dear R users,
>>>> >
>>>> > I'm using mixed effects models (lmer) to predict a binary dependent
>>>> > variable as a function of 1. a categorical predictor (A) with 2 levels (A1
>>>> > and A2), 2. another categorical predictor (B) with three levels (B1, B2 and
>>>> > B3), and 3. the interaction between these two predictors. I have tried two
>>>> > models but they return different results and I'm not sure which one is
>>>> > correct. I'm interested in the main effect of B and the interaction between
>>>> > A and B (because A alone has a significant effect in both models). My
>>>> > problem is that there seem to be two sensible ways of examining the main
>>>> > effect of B: 1. to Helmert code and 2. to center. But these two methods
>>>> > produce opposite results! I don't know which one I should use. Here are the
>>>> > two models with some details and their outputs:
>>>> >
>>>> >
>>>> > Model 1: 'A' is centered. 'B' is Helmert coded ('B1' (baseline) = 2,
>>>> > 'B2' = -1, 'B3' = -1) so that I can get a main effect of B by checking
>>>> > whether the baseline condition of B differs from the mean of B2 and B3.
>>>> > The lmer output returns a significant effect of B and no significant AxB
>>>> > interaction. However, as highlighted below, the correlation between B and
>>>> > the 'AxB' interaction is high (-54%).
>>>> >
>>>> > > model.1<-lmer(response~A*B+(A+1|sub)+(1|item), mydata, family="binomial")
>>>> > > print(model.1)
>>>> > Generalized linear mixed model fit by the Laplace approximation
>>>> > Formula: response ~ A * B + (A + 1 | sub) + (1 | item)
>>>> >    Data: mydata
>>>> >   AIC   BIC logLik deviance
>>>> >   783 822.6 -383.5      767
>>>> > Random effects:
>>>> >  Groups Name        Variance Std.Dev. Corr
>>>> >  item   (Intercept) 0.7293   0.85399
>>>> >  sub    (Intercept) 2.0871   1.44468
>>>> >         A           1.3812   1.17524  0.562
>>>> > Number of obs: 1038, groups: item, 42; sub, 36
>>>> >
>>>> > Fixed effects:
>>>> >             Estimate Std. Error z value Pr(>|z|)
>>>> > (Intercept)  1.05261    0.30283   3.476 0.000509 ***
>>>> > A           -3.91080    0.32239 -12.131  < 2e-16 ***
>>>> > B            0.36128    0.09751   3.705 0.000211 ***
>>>> > A:B         -0.29638    0.18681  -1.586 0.112626
>>>> > ---
>>>> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>> >
>>>> > Correlation of Fixed Effects:
>>>> >     (Intr)  A      B
>>>> > A    0.155
>>>> > B    0.160 -0.278
>>>> > A:B -0.156  0.238 -0.540
>>>> >
>>>> > Model 2: 'A' and 'B' are both centered. The lmer output returns no
>>>> > significant effect of B but the A:B interaction is significant. The
>>>> > correlations between predictors are generally lower and the correlation
>>>> > between B and A:B is reduced to -26%.
>>>> >
>>>> > Generalized linear mixed model fit by the Laplace approximation
>>>> > Formula: response ~ A * B + (A + 1 | sub) + (1 | item)
>>>> >    Data: mydata
>>>> >    AIC   BIC logLik deviance
>>>> >  756.1 795.7 -370.1    740.1
>>>> > Random effects:
>>>> >  Groups Name        Variance Std.Dev. Corr
>>>> >  item   (Intercept) 0.87028  0.93289
>>>> >  sub    (Intercept) 2.41707  1.55469
>>>> >         A           1.23669  1.11206  0.533
>>>> > Number of obs: 1038, groups: item, 42; sub, 36
>>>> >
>>>> > Fixed effects:
>>>> >             Estimate Std. Error z value Pr(>|z|)
>>>> > (Intercept)   1.1004     0.3239   3.398 0.000679 ***
>>>> > A            -4.0941     0.3248 -12.605  < 2e-16 ***
>>>> > B            -0.1461     0.1400  -1.043 0.296851
>>>> > A:B           1.7923     0.2818   6.360 2.01e-10 ***
>>>> > ---
>>>> > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>>> >
>>>> > Correlation of Fixed Effects:
>>>> >     (Intr)  A      B
>>>> > A    0.138
>>>> > B   -0.148  0.185
>>>> > A:B  0.106 -0.292 -0.265
>>>> >
>>>> > I personally think Model 2 is better, but the thing is that I have centered
>>>> > a categorical predictor with *three* levels. In my searches on the web, I
>>>> > have never seen a three-level predictor centered; they were all two-level
>>>> > categorical predictors.
>>>> >
>>>> > I have used the scale() function to center the predictors (I first
>>>> > converted them to numeric variables and then used scale() to center them).
>>>> > As I mentioned, my problem is that I don't know how to get a main effect of
>>>> > B as well as a *main* A:B interaction. On the one hand, it seems logical to
>>>> > compare 'B1' (baseline) with the mean of the other two B conditions to see
>>>> > if the B manipulation has a general effect. On the other hand, I hear that
>>>> > one needs to center variables to get a main effect.
>>>> >
>>>> > I would be grateful if you could please help me.
>>>> >
>>>> > Regards,
>>>> >
>>>> > Hossein
>>>>
>>>>
>>>
>>
>
>