[R-lang] Binomial data using mixed-designs model

Fri Jul 2 02:37:40 PDT 2010

Hi dear R-lang users,

I have a set of binomial data that I would like to analyze with lmer().
The data are participants' choices of stress placement in two-word compounds
(e.g. blackboard): specifically, whether participants stressed the first word
or the second word of the compound. There are three groups of words; each
group contains the same number of words, and no word appears in more than one
group. Each participant sees all three lists.

e.g.
*Familiarity rating is on a scale of 1-7 -> 7 being very familiar and 1
being not familiar at all.

subject  group  word          N1  Familiarity
1        G1     blackboard    1   7
1        G1     checkbook     1   7
.        .      .             .   .
1        G2     roof-mounted  0   5
.        .      .             .   .

I made the following model:

> cp.lmer1 <- lmer(N1 ~ group*FamiliarityRating + (1|subject) + (1|word),
                   data = cp, family = "binomial")

I have the following questions:

(1). First of all, after I ran str(cp), the result shows that
FamiliarityRating is stored as 'int' (integer, I assume). Should I convert it
to a factor using as.factor(), or is 'int' fine?
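
To make question (1) concrete, here is a minimal sketch of what the two codings do to the design matrix. The data frame `toy` is a hypothetical stand-in, not the real `cp` data; only base R is used.

```r
# Toy data: a hypothetical 1-7 rating, not the real `cp` data set
toy <- data.frame(FamiliarityRating = c(1L, 2L, 3L, 4L, 5L, 6L, 7L))

# Kept as integer: the rating enters the model as a single linear slope
X_num <- model.matrix(~ FamiliarityRating, data = toy)

# Converted to a factor: one dummy column per level beyond the first
X_fac <- model.matrix(~ as.factor(FamiliarityRating), data = toy)

ncol(X_num)  # intercept + 1 slope
ncol(X_fac)  # intercept + 6 level contrasts
```

So the choice is between one slope parameter (rating treated as numeric) and six separate level parameters (rating treated as categorical); in an interaction with group, the factor version multiplies the number of parameters accordingly.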

(2). Second, I would like to know whether the fit of the model looks right.

(3). Third, my understanding of lmer() with binomial data may be wrong, but
I thought that when fitting the model with the "binomial" family, lmer()
would treat one level as the control (e.g. G1 in my data) and compare every
other level against it (e.g. G1 vs. G2, G1 vs. G3, but crucially *not* G2
vs. G3). The results I obtained from lmer() are below.

Generalized linear mixed model fit by the Laplace approximation
Formula: N1 ~ group * Familiarity + (1 | Participant) + (1 | Word)
   Data: cp
   AIC BIC logLik deviance
 230.4 262 -107.2    214.4
Random effects:
 Groups      Name        Variance Std.Dev.
 Word        (Intercept) 2.73735  1.65449
 Participant (Intercept) 0.60344  0.77681
Number of obs: 384, groups: Word, 24; Participant, 16

Fixed effects:
                    Estimate Std. Error z value Pr(>|z|)
(Intercept)         -1.03647    0.79997  -1.296 0.195100
groupG2             19.90576 3142.42809   0.006 0.994946
groupG3              3.69099    1.10661   3.335 0.000852 ***
Familiarity         -0.08980    0.13171  -0.682 0.495336
groupG2:Familiarity  0.10600  384.87250   0.000 0.999780
groupG3:Familiarity  0.07617    0.18065   0.422 0.673302
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) grpgrn groprd Fmlrty grpg:F
groupG2      0.000
groupG3     -0.663  0.000
Familiarity -0.525  0.000  0.344
grpG2:Fmlr   0.000 -0.938  0.000  0.000
grpG3:Fmlrt  0.346  0.000 -0.530 -0.659  0.000
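
Regarding question (2): the groupG2 row above (estimate 19.9, standard error 3142) is the kind of pattern that often appears when one group shows almost no variation in the response. Whether that is what happened here is an assumption that would have to be checked against the real data. The sketch below reproduces the pattern with hypothetical toy data and a plain glm() (no random effects, to stay within base R).

```r
# Toy data (hypothetical): G1 has mixed responses, G2 has only 1s
toy <- data.frame(
  group = factor(rep(c("G1", "G2"), each = 8)),
  N1    = c(0, 1, 0, 1, 1, 0, 1, 0,   # G1: half 0s, half 1s
            1, 1, 1, 1, 1, 1, 1, 1)   # G2: all 1s
)

# Plain logistic regression. glm() normally warns here that fitted
# probabilities are numerically 0 or 1; the warning is suppressed.
fit <- suppressWarnings(glm(N1 ~ group, data = toy, family = binomial))

# The groupG2 row shows a large estimate with a very large standard error
est <- coef(summary(fit))
print(est)

# Cross-tabulating the raw data shows the empty cell directly
print(table(toy$group, toy$N1))
```

A cross-tabulation such as table(cp$group, cp$N1) on the real data would show whether any group is (nearly) all 0s or all 1s.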

Since I am interested in all-pairwise comparisons (namely,  G1 vs. G2, G1
vs. G3, and G2 vs. G3), I thought I should do post-hoc tests. So, I resorted
to the 'multcomp' package by doing the following:

>summary(glht(cp.lmer1, linfct=mcp(group = "Tukey")))

Is this an appropriate method?
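
The default coding described in (3), and one base-R way of getting at the missing G2 vs. G3 comparison, can be sketched as follows. The factor below is a hypothetical toy object with the same level names; it is not the real data.

```r
# Toy factor with the same three level names (hypothetical data)
group <- factor(c("G1", "G2", "G3"))

# Default treatment coding: G1 is the reference level, so the columns
# stand for G2 vs. G1 and G3 vs. G1; G2 vs. G3 is not among them
contrasts(group)

# Making G2 the reference level yields G1 vs. G2 and G3 vs. G2 instead
group_ref2 <- relevel(group, ref = "G2")
levels(group_ref2)
contrasts(group_ref2)
```

Refitting the model with the releveled factor would report G3 vs. G2 directly, though without any adjustment for multiple comparisons. Note also that in a model with a group-by-Familiarity interaction, these group coefficients describe differences at Familiarity = 0, which lies outside the 1-7 scale.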

(4). Lastly, after I ran the post-hoc test, I got a warning message (shown
after the output below):

         Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: glmer(formula = N1 ~ group * Familiarity + (1 | Participant) +
    (1 | Word), data = cp, family = "binomial")

Linear Hypotheses:
             Estimate Std. Error z value Pr(>|z|)
G2 - G1 == 0   19.906   3142.428   0.006  0.99997
G3 - G1 == 0    3.691      1.107   3.335  0.00170 **
G3 - G2 == 0  -16.215   3142.428  -0.005  0.99998
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

Warning message:
In mcp2matrix(model, linfct = linfct) :
  covariate interactions found -- default contrast might be inappropriate

Is there any way to avoid this covariate interaction problem?
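
For illustration, one way to make the group comparison explicit when the model contains a group-by-covariate interaction is to build the contrast by hand at a chosen covariate value. The sketch below uses simulated data and a plain glm() without random effects, so every number in it is hypothetical; the evaluation point `f0` is an arbitrary choice, not a recommendation.

```r
set.seed(1)

# Simulated stand-in for the real data (all values hypothetical)
n   <- 300
toy <- data.frame(
  group       = factor(sample(c("G1", "G2", "G3"), n, replace = TRUE)),
  Familiarity = sample(1:7, n, replace = TRUE)
)
toy$N1 <- rbinom(n, 1, plogis(-0.5 + 0.8 * (toy$group == "G3")))

# Plain logistic regression; random effects are left out to stay in base R
fit <- glm(N1 ~ group * Familiarity, data = toy, family = binomial)

# Familiarity value at which the groups are compared (an arbitrary choice)
f0 <- mean(toy$Familiarity)

# Contrast for G3 - G2 on the logit scale, in the order of coef(fit):
# (Intercept), groupG2, groupG3, Familiarity, groupG2:Fam, groupG3:Fam
K <- c(0, -1, 1, 0, -f0, f0)

est <- sum(K * coef(fit))
se  <- sqrt(drop(t(K) %*% vcov(fit) %*% K))
c(estimate = est, se = se, z = est / se)
```

The contrast includes the interaction terms weighted by `f0`, so it compares the two groups at that familiarity value rather than at Familiarity = 0. glht() also accepts a contrast matrix through its linfct argument, so a set of such rows could be passed to it in place of mcp(); this sketch itself applies no multiplicity adjustment.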

Thank you in advance for your help!

Xiao He
Graduate student
Department of Linguistics
University of Southern California