[R-lang] Generalized linear mixed models

Tue Jun 5 07:59:34 PDT 2007

Hi all,

I've recently been exploring beyond my established comfort zone with
mixed models, and am looking for some correction or reassurance.  I am
working with experimental data on social perceptions of linguistic
variation.  I've got two types of dependent variables: ratings on a
6-point scale (e.g. not at all intelligent to very intelligent), which
I've been treating as linear variables, and binary variables, based on
whether a given term was selected as a good description of a speaker
(e.g. hardworking).

The independent variables (well, some of them) were:
speaker (8)
recording (nested, 4 for each speaker) -- which recording was being responded to
(ING) (3) -- crossed with recording, indicates which guise of the
variable (ING) was used (e.g. working or workin')
two measures of listener mood: pleasant and arousal

The structure of the experiment was such that every subject heard one
recording (which also represented one (ING) guise) from each speaker.
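
To make that structure concrete, here is a minimal sketch in base R of
the layout just described.  It is only an illustration: the number of
subjects, the guise labels, the rec_within column and the random
assignment are all assumptions, not the actual whitenoise data.

```r
# Hypothetical layout only: simulated assignment, not the real data.
set.seed(1)
n_subjects <- 12                      # assumed number of listeners
speakers   <- paste0("sp", 1:8)       # 8 speakers
guises     <- paste0("guise", 1:3)    # placeholder labels for the 3 (ING) guises

# One row per subject-by-speaker pair: each subject hears every speaker once.
design <- expand.grid(subject_id = paste0("s", 1:n_subjects),
                      speaker    = speakers,
                      stringsAsFactors = FALSE)

# For each row, pick one of the speaker's 4 recordings and one (ING) guise.
design$rec_within <- sample(1:4, nrow(design), replace = TRUE)
design$ining      <- sample(guises, nrow(design), replace = TRUE)

# Label recordings uniquely across speakers (e.g. "sp3_r2"), so that a
# grouping factor built on this column refers to recording within speaker.
design$recording <- paste(design$speaker, design$rec_within, sep = "_r")

# Sanity check: exactly one response per subject-speaker pair.
stopifnot(all(table(design$subject_id, design$speaker) == 1))
```

With the recordings labelled this way there are at most 8 x 4 = 32
distinct recording levels, each belonging to exactly one speaker.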

In the past with similar data, I have been using nlme for linear mixed
models, and using subject id as a random effect. (ING) effects and the
interaction of (ING) with the other variables, such as speaker, are the
main point of interest. I have two questions.

1) Is it more appropriate to build in both subject id and the
recording choice as random effects, rather than including only
the subject id?  I am treating speakers as fixed effects,
deliberately-- I have no expectation that these particular speakers
are representative of anyone except themselves.  But the recordings
within each speaker were randomly assigned to listeners.

2) When doing an analysis of the binary variables, how can I tell
whether overdispersion and/or zero-inflation is an issue for me?

Bringing these two questions together, I have been looking at using
lmer for both the "linear" and the binary variables, with something
like these:

lmer(intellect ~ speaker*ining*(pleasant_mood+mood_arousal)
       + (1|subject_id) + (1|recording),
     data=whitenoise)

lmer(hardworking ~ speaker*ining*(pleasant_mood+mood_arousal)
       + (1|subject_id) + (1|recording),
     family=binomial, data=whitenoise, method="AGQ")

Does this make sense, and do I need the "recording" term?  And how can I
determine if I need to be concerned about zero-inflation and if so, is
glmmADMB my only option for the binary variables (a pain, since I
mostly use Macs)?

Many thanks,

Kathryn