[R-lang] Re: [R-sig-ME] Post-hoc comparison for incorrect responses in glmer

Levy, Roger rlevy@ucsd.edu
Sun Nov 24 11:19:00 PST 2013


Dear Frank,

First off, it is bad form to cross-post (simultaneously post your query to more than one newsgroup).  I’m responding only to r-lang.

Anyway, based on the heart of your query:

> What I'm trying to get to the bottom of is
> whether the proportion of mismatches (incorrect response) when
> recalling /of/ and 's sentences for one semantic combination (say IN_AN)
> is significantly higher or lower than the other (AN_IN in top table).


I can’t tell from your description whether you have multiple items, or whether the differences in semantic combination and syntactic presentation are between-speaker or within-speaker, or between- or within-items (if you have more than one item).  Assuming you have more than one item, and that both syntax and semantics vary within both speakers and items, I think you want to fit the model as follows:

  glmer(Correct ~ Syntax * Semantics + (Syntax * Semantics | Speaker) + (Syntax * Semantics | Item), data=name.of.your.data.frame, family="binomial", contrasts=list(Syntax=contr.sum(2), Semantics=contr.sum(2)))

I believe the resulting model should give you a parameter estimate for the fixed effect of Semantics that can more or less be interpreted as the “main effect” of Semantics on correct response rate that you’re looking for.
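To see why sum coding gives a “main effect” reading, here is a minimal sketch in base R. It uses the approximate cell counts quoted from the original post and, as a loud simplifying assumption, ignores the Speaker and Item random effects entirely, so its standard errors should not be trusted; it only illustrates the coding. The data frame `cells` and the names `by_sem` and `half_diff` are made up for the illustration. With contr.sum(2), the two levels of each factor are coded +1 and -1, so the Semantics coefficient is half the difference in log-odds of a correct response between AN_IN and IN_AN, averaged over the two levels of Syntax.

```r
# Approximate counts quoted in the original post (not the exact data).
cells <- data.frame(
  Syntax    = factor(c("s", "of", "s", "of")),
  Semantics = factor(c("AN_IN", "AN_IN", "IN_AN", "IN_AN")),
  correct   = c(135, 120, 101, 150),
  incorrect = c(14, 15, 12, 25)
)

# Ordinary logistic regression with sum-coded predictors.
# NOTE: no random effects here; this only illustrates the contrast coding.
fit <- glm(cbind(correct, incorrect) ~ Syntax * Semantics,
           family = binomial, data = cells,
           contrasts = list(Syntax = contr.sum(2), Semantics = contr.sum(2)))

# Log-odds of a correct response in each cell, averaged over Syntax
# within each level of Semantics.
logit  <- with(cells, log(correct / incorrect))
by_sem <- tapply(logit, cells$Semantics, mean)

# Half the AN_IN minus IN_AN difference; this should agree with the
# fitted coefficient named "Semantics1".
half_diff <- unname((by_sem["AN_IN"] - by_sem["IN_AN"]) / 2)
print(c(coefficient = unname(coef(fit)["Semantics1"]), by_hand = half_diff))
```

Under the default treatment coding, by contrast, the Semantics coefficient in a model with an interaction would be the effect of Semantics at the reference level of Syntax only, which is not what you asked about.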

More generally, I suggest that you carefully read Baayen et al. (2008), Jaeger (2008), Quené and van den Bergh (2008), and Barr et al. (2013) to get a better understanding of all the issues involved in why this would be the correct model specification.
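On your side question of whether the z and p values apply to the log-odds of a correct rather than an incorrect response: in a logit model the two are mirror images, so recoding the response only flips the sign of each coefficient. The following is a small sketch with simulated data and a plain glm (no random effects); the variable names and the 0.90/0.80 accuracy rates are invented for the illustration. The same sign-flip logic should carry over to the fixed effects of a logit mixed model.

```r
set.seed(1)

# Hypothetical simulated data: 200 trials per semantic context.
n <- 400
semantics <- factor(rep(c("AN_IN", "IN_AN"), each = n / 2))
p_correct <- ifelse(semantics == "AN_IN", 0.90, 0.80)
correct   <- rbinom(n, 1, p_correct)
incorrect <- 1 - correct

# Same predictor, response coded both ways.
fit_correct   <- glm(correct ~ semantics, family = binomial)
fit_incorrect <- glm(incorrect ~ semantics, family = binomial)

# Estimates and z values change sign; p values are identical.
print(coef(summary(fit_correct)))
print(coef(summary(fit_incorrect)))
```

So a significant positive estimate for the log-odds of a correct response is the same test as a significant negative estimate for the log-odds of an incorrect one.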

Best

Roger

  

On Nov 23, 2013, at 7:23 AM, Francesco <fbromano@sabanciuniv.edu> wrote:

> 
> I have reposted this as some tabular information got jumbled by the
> ASCII/HTML difference.
> 
> My data set includes a binary categorical DV (correct or incorrect). In
> a nutshell, I'm looking at Chinese speakers of English's choice of 's
> or /of/ possessives when recalling a sentence that contained either of
> the two possessive structures 's or /of/. This choice is, however,
> moderated by the animacy of the elements participating in the possessive
> construction. That is, the sentence to be recalled had either structure
> AN_IN or IN_AN which makes for a second IV with two levels.
> 
> The data obtained looks approximately like the below, values
> are not calculated exactly (my apologies for not
> reorganising this according to correct vs incorrect for display purposes).
> 
> 
> Responses in AN_IN contexts
> 
> structure to be recalled    Use of 's              Use of 'of'
> 
> 			count    % of total    count  % of total
> 's			135         64		14       7
> 
> 'of' 			15	    10		120      63
> 
> 
> 
> Responses in IN_AN contexts
> 
> structure to be recalled    Use of 's              Use of 'of'
> 
> 			count    % of total    count  % of total
> 's			101         60		12       7
> 
> 'of' 			25	    10		150      62
> 
> 
> I'm trying to understand how to obtain a z and p value for pairwise
> comparisons on probability of the non-default outcome of the binary
> categorical variable. What I'm trying to get to the bottom of is
> whether the proportion of mismatches (incorrect response) when
> recalling /of/ and 's sentences for one semantic combination (say IN_AN)
> is significantly higher or lower than the other (AN_IN in top table).
> This means whether, in the AN_IN
> contexts, counts of 15 and 14 for use of 's in /of/ contexts and use of
> /of/ in 's contexts respectively are significantly higher than their
> counterparts in the other table (25 and 12). How do I do this?
> 
> If my understanding is correct, the model I have fit below only tells me there is a
> significant effect for semantic context
> but the z and p value apply to the log-odds of obtaining a CORRECT
> response, not an incorrect one. Right?
> 
> 
> Generalized linear mixed model fit by maximum likelihood ['glmerMod']
> 
> Family: binomial ( logit )
> 
> Formula: Correct2 ~ Semantics + (1 + Syntax | ID)
> 
> Data: CodedQ3
> 
>       AIC       BIC     logLik   deviance
> 
>  236.6061  255.5098  -113.3030   226.6061
> 
> Random effects:
> 
> Groups  Name         Variance  Std.Dev.  Corr
> 
> ID      (Intercept)  0.7843    0.8856
> 
>         Syntaxs      4.0486    2.0121    -0.80
> 
> Number of obs: 324, groups: ID, 27
> 
> Fixed effects:
> 
>               Estimate   Std. Error    z value   Pr(>|z|)
> 
> (Intercept)    -3.0211    0.3735       -8.088   6.07e-16 ***
> 
> SemanticsIN_AN  1.1249    0.4170        2.698   0.00698 **
> 
> ---
> 
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Correlation of Fixed Effects:
> 
>             (Intr)
> 
> SmntcsIN_AN -0.792
> 
> 
> 
> Setting considerations of model fit and collinearity aside for the
> moment, if I want to obtain a z and p value for the difference between
> proportions of INCORRECT responses by semantic context, how should I
> proceed?
> 
> In passing, could someone also point out how I would look at pairwise
> comparisons once I factor in interactions, as if, for example, I had
> included an interaction for Syntax in the model above.
> 
> The comparisons would look like this:
> 
> syntaxof:IN_AN vs. syntaxof:AN_IN
> syntax's:IN_AN vs syntax's:AN_IN
> 
> ->where 'correct' is the reference level for the DV;
> 
> syntaxof:IN_AN vs. syntaxof:AN_IN
> syntax's:IN_AN vs syntax's:AN_IN
> 
> ->where 'incorrect' is the reference level of the DV.
> 
> Any guidance is much appreciated.
> 
> -- 
> Frank Romano
> 
> Sabanci University
> website:http://sabanciuniv.academia.edu/FrancescoRomano
> 



