[R-lang] (Mis)using MCMCglmm

Sun Dec 6 07:42:03 PST 2009

Dear R-lang users,

I am trying to use the MCMCglmm package to analyze some data relating to prosody. Although I am not new to R, it is fair to say that I know very little about mixed models, so please pardon the naïvety of my questions. For convenience, I have divided them into two parts, which I think can be answered more or less independently.

1. Is MCMCglmm appropriate for what I am trying to do?

Twenty-two subjects were asked to categorize 16 sentences, assigning each of them one of four labels: A(ssertion), Q(uestion), E(xclamation) or ind(eterminate). This is my response variable (categorical, nominal, four levels). The experimenters had assigned each sentence one of four labels: A(ssertion), Q(uestion), E(xclamation) or C(ontinuation). This is my independent variable (categorical, nominal, four levels). I arranged the results in a table with 22 x 16 = 352 rows, as follows:

     subj    pair.type  response
1    subj1   A          A
2    subj1   E          E
3    subj1   C          ind
4    subj1   Q          Q
5    subj1   Q          Q
6    subj1   E          E
...
347  subj22  C          Q
348  subj22  A          A
349  subj22  Q          Q
350  subj22  E          E
351  subj22  C          E
352  subj22  A          A

I would like to test whether pair.type influences the response. My impression is that subjects overwhelmingly interpret continuatives as questions. For instance, clustering on the answers returns a single cluster for questions and continuatives.
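To make the kind of pattern I am describing concrete, here is a minimal sketch of a cross-tabulation of pair.type against response. It uses only the twelve rows displayed above, so the counts are purely illustrative; the object names d and tab are my own, and the commented read.delim() line assumes a local copy of the data file with a header row.

```r
# Toy subset: only the twelve rows displayed above, NOT the full 352-row data set.
d <- data.frame(
  subj      = rep(c("subj1", "subj22"), each = 6),
  pair.type = c("A", "E", "C",   "Q", "Q", "E", "C", "A", "Q", "E", "C", "A"),
  response  = c("A", "E", "ind", "Q", "Q", "E", "Q", "A", "Q", "E", "E", "A"),
  stringsAsFactors = TRUE
)

# With the real file one would presumably start from something like
# (assumption: a local tab-separated copy with a header row):
# d <- read.delim("data.txt")

# Counts of each response label for each pair.type
tab <- table(pair.type = d$pair.type, response = d$response)
print(tab)

# Row proportions: how each pair.type is distributed over the responses
print(round(prop.table(tab, margin = 1), 2))
```

On the full data, the row for C is where I would expect the mass to sit under Q rather than under a label of its own.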

I ran an exploratory logistic regression using VGAM. The results fit my intuition but, of course, they suffer from pseudo-replication (each subject contributes 16 observations). My idea was to use a mixed model with a random effect for subjects. Does this make sense in the case at hand, and does it make sense with a data set as small as this one?

2. Parameterizing the MCMCglmm call

Borrowing alternately from Florian Jaeger's blog page (http://hlplab.wordpress.com/2009/05/07/multinomial-random-effects-models-in-r/#more-378) and the tutorial document for MCMCglmm, I tried the following:

k <- 4
I <- diag(k-1)
J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
prior <- list(R = list(fix = 1, V = 0.25 * (I + J), n = 3),
              G = list(G1 = list(V = diag(4), n = 4),
                       G2 = list(V = diag(4), n = 4)))
m <- MCMCglmm(response ~ -1 + trait + pair.type,
              random = ~us(trait):subj + us(pair.type):subj,
              rcov = ~us(trait):units,
              family = "categorical",
              burnin = 15000, nitt = 40000,
              data = data, prior = prior, verbose = TRUE)

The package complains that V is the wrong dimension "for some priorG/priorR elements".
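One check I can sketch without running the sampler is to compare the dimension of each prior V with the number of levels of the term it is attached to. This rests on an assumption I have not confirmed: that a categorical response with k levels is modelled with k - 1 latent traits, so that us(trait) terms need a (k-1) x (k-1) matrix, while us(pair.type) needs one row per level of pair.type. The names n.traits, n.pair.type, expected, actual and check are my own.

```r
k           <- 4        # response levels: A, Q, E, ind
n.traits    <- k - 1    # ASSUMPTION: k - 1 latent traits for family = "categorical"
n.pair.type <- 4        # pair.type levels: A, Q, E, C

# Same prior as in the call above
I <- diag(k - 1)
J <- matrix(1, k - 1, k - 1)
prior <- list(R = list(fix = 1, V = 0.25 * (I + J), n = 3),
              G = list(G1 = list(V = diag(4), n = 4),
                       G2 = list(V = diag(4), n = 4)))

# Dimension each V would need under the assumption above:
# R  <-> rcov   = ~us(trait):units
# G1 <-> random = ~us(trait):subj
# G2 <-> random = ~us(pair.type):subj
expected <- c(R = n.traits, G1 = n.traits, G2 = n.pair.type)
actual   <- c(R  = nrow(prior$R$V),
              G1 = nrow(prior$G$G1$V),
              G2 = nrow(prior$G$G2$V))

check <- data.frame(expected = expected, actual = actual,
                    ok = expected == actual,
                    row.names = names(expected))
print(check)
```

If the assumption holds, the G1 row is the only mismatch (a 4 x 4 matrix where a 3 x 3 one would be expected), but I would welcome confirmation that this is really what the error message refers to.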

I tried a number of variants (on R, G and the fixed vs. random effect distribution), but, obviously, I am stuck ... and unsure about whether my attempt to use subjects as random effect carriers makes sense in the first place!

Any help would be appreciated. The data (with a \t separator) can be seen at http://pagesperso-orange.fr/jjayez/data.txt

Jacques Jayez
Laboratoire Langage, Cerveau, Cognition (http://l2c2.isc.cnrs.fr/en/)
Lyon, France