[R-lang] Re: is it possible to use lmer function with a polytomous response variable?

Mon Apr 12 00:05:30 PDT 2010

Dear all,

The response to gurcanli@cogsci.jhu.edu in turn reminded me of an 
analysis of multiple types of mispronunciations (completed, 
interrupted, competing, other mispronunciation, no mispronunciation), 
which we modeled by means of multinomial regression using the function 
multinom from the nnet package, with a 2x2 design of fixed predictors.

The random effect of items was imitated by means of a two-stage 
bootstrap, i.e. first bootstrapping over the random effect (items) 
and then over the residuals (responses). A condensed version of the 
R script used for our data analysis is included below, with 
apologies for the somewhat clunky programming style.
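
The two-stage resampling idea can be tried out on toy data first. The following is only a minimal sketch, not the script used for the paper: the data frame toy and its columns item and resp are made up for illustration, and sampling with replacement is assumed at both stages.

```r
# toy data (hypothetical): 4 items with unequal numbers of responses
set.seed(1)
toy <- data.frame(item = rep(c("a", "b", "c", "d"), times = c(3, 5, 2, 4)),
                  resp = sample(c("ok", "slip"), 14, replace = TRUE),
                  stringsAsFactors = FALSE)
items <- sort(unique(toy$item))
nitem <- length(items)

# stage one: resample nitem-1 items with replacement
these.items <- items[sample.int(nitem, size = nitem - 1, replace = TRUE)]
# an item that is drawn twice contributes all of its rows twice
stage1 <- do.call(rbind,
                  lapply(these.items, function(it) toy[toy$item == it, ]))

# stage two: resample the rows (responses) of the stage-one data set
nrows <- nrow(stage1)
stage2 <- stage1[sample.int(nrows, size = nrows, replace = TRUE), ]
```

The data set stage2 would then be passed to the model-fitting function, and the whole procedure repeated for each bootstrap iteration.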

For more details and references see:
S.G. Nooteboom & H. Quené (2008). Self-monitoring and feedback: a 
new attempt to find the main cause of lexical bias in phonological 
speech errors. Journal of Memory and Language, 58 (3), 837-861. 
[doi:10.1016/j.jml.2007.05.003].
http://www.let.uu.nl/~Sieb.Nooteboom/personal/SelfMon&Feedbprf.pdf

With kind regards, Hugo Quené

# bootstrap simulation of mixed-effects multinomial logistic regression
# Hugo Quené, 23 November 2006
#
# bootstrap over pairs=items, not participants,
# because between-items var happens to be larger than between-subj var
#
# for further description and references, see
# S.G. Nooteboom & H. Quené (2008). Self-monitoring and feedback: a
# new attempt to find the main cause of lexical bias in phonological
# speech errors. Journal of Memory and Language, 58 (3), 837-861.
# [doi:10.1016/j.jml.2007.05.003].
#
# initializations
# use data=slip2
require(nnet) # needed for function multinom
items <- sort(unique(slip2$paar))
nitem <- length(items)
set.seed(1981) # random number seed, for reproduction
niter <- 10 # adjust as desired, niter=250 recommended
# multinom logistic regression will yield 5*3 coefs
results2.coef <- matrix(NA,ncol=15) # matrix for resulting coefs
#
# loop for bootstrap iterations starts here
for (iter in 1:niter) {
	# stage one: perform bootstrap over pairs of items
	# bootstrap resampling stage 1: draw nitem-1 items with replacement,
	# so that an item may be drawn more than once
	these.items <- sort(items[sample.int(nitem,size=nitem-1,replace=TRUE)])
	# see Eq.6.29, p.248

	# construct data set, containing all data from items sampled in these.items.
	# an item that was drawn more than once contributes its rows more than once;
	# the number of observations per item varies among items.
	# items are identified by column paar, as in the initializations above.
	bootdata <- do.call( rbind, lapply( these.items,
		function(it) slip2[slip2$paar==it,] ) )

	# stage two: perform bootstrap over bootdata
	nrows <- nrow(bootdata)
	# bootstrap resampling stage 2: draw nrows rows with replacement
	these.rows <- sort(sample.int(nrows,size=nrows,replace=TRUE))
	bootdata.rs <- bootdata[these.rows,]
	# add fresh predictors N1,N2,L1,L2, extracted from *resampled* data set
	N1 <- as.factor(bootdata.rs$lexi==0 & bootdata.rs$feat2==1)
	N2 <- as.factor(bootdata.rs$lexi==0 & bootdata.rs$feat2==2)
	L1 <- as.factor(bootdata.rs$lexi==1 & bootdata.rs$feat2==1)
	L2 <- as.factor(bootdata.rs$lexi==1 & bootdata.rs$feat2==2)

	# perform multinom regression
	mu1 <- multinom(type ~ 1 + N1+N2+L1+L2,data=bootdata.rs) # multinom
	# rbind (i.e. append) to results matrix; length of added vector is 15
	results2.coef <- rbind(results2.coef,as.vector(coef(mu1)))

	# wrap-up after each iteration
	rm(N1,N2,L1,L2) # cleanup before next iteration
} # loop for bootstrap iteration ends here

# show results
results2new.coef <- results2.coef[2:(niter+1),] # discard first row of NAs
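
The resulting matrix has one row per bootstrap iteration and one column per coefficient, so it can be summarised column by column. The following is a minimal sketch, assuming that percentile intervals are wanted (the script above does not say how the results were summarised). The matrix boot.coef is a random stand-in for results2new.coef, used only so that the example runs by itself; its values mean nothing.

```r
set.seed(1981)
niter <- 250
# stand-in (hypothetical) for results2new.coef: niter rows, 15 coefficients
boot.coef <- matrix(rnorm(niter * 15), nrow = niter, ncol = 15)

# bootstrap mean, standard error, and 95% percentile interval per coefficient
boot.summary <- data.frame(
  mean  = colMeans(boot.coef),
  se    = apply(boot.coef, 2, sd),
  lower = apply(boot.coef, 2, quantile, probs = 0.025),
  upper = apply(boot.coef, 2, quantile, probs = 0.975))
print(round(boot.summary, 3))
```

With the real results, boot.coef would simply be replaced by results2new.coef. A coefficient whose interval excludes zero could then be taken as reliably different from zero under this bootstrap.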

On 2010.04.10 23:26 , Antti Arppe wrote:
> Dear all,
>
> On Thu, 8 Apr 2010, gurcanli@cogsci.jhu.edu wrote:
>> In my data I have a polytomous response variable which has 4 levels (4
>> different verb categories). I want to analyze the data by using lmer
>> function with fixed and random effects. Is it possible? If not, what other
>> package can I use? Since I am new to this type of analysis, I would
>> appreciate if you could give the details of the R-code.
>
> The four verb categories you mention cannot but remind me of the four
> synonymous (Finnish) verbs that I modeled with logistic regression in
> my doctoral dissertation - though strictly speaking in terms of
> linguistic explanatory variables as fixed effects (with authors as
> clusters in a bootstrap validation of the model masquerading as the
> impact of random effects). As a prelude (for some possible solutions)
> to your mixed effects logistic regression modeling, the various
> different heuristics for implementing this are described on pp.
> 113-116 and 119-125 in the published dissertation (to a certain
> extent based on a paper by Eibe Frank and Stefan Kramer, 2004:
> Ensembles of nested dichotomies for multi-class problems, ACM
> International Conference Proceeding Series; Vol. 69), anyhow:
>
> http://www.ling.helsinki.fi/~aarppe/Publications/Arppe_Dissertation_Final_Print.pdf
> [the entire dissertation electronically published as:
> http://urn.fi/URN:ISBN:978-952-10-5175-3]
>
> A brief overview of these heuristics can be found in a recent
> presentation:
>
> http://www.ling.helsinki.fi/~aarppe/Publications/Alberta_PLR_Arppe_100226.pdf
>

-- 
Dr Hugo Quené | Utrecht inst voor Linguïstiek OTS | Universiteit 
Utrecht | Trans 10 | 3512 JK Utrecht | The Netherlands | T +31 30 
253 6070 | F +31 30 253 6000 | H.Quene@uu.nl | www.hugoquene.nl | 
www.hum.uu.nl