[R-lang] Post-hoc comparison for incorrect responses in glmer

Tue Nov 19 07:30:13 PST 2013

I'm trying to understand how to obtain z and p values for pairwise 
comparisons on the probability of the non-default outcome of a binary 
categorical variable.

My data set includes a binary categorical DV (correct or incorrect). In 
a nutshell, I'm looking at which possessive, 's or /of/, Chinese speakers 
of English produce when recalling a sentence that contained one of the 
two structures. This choice is, however, moderated by the animacy of the 
elements participating in the possessive construction. That is, the 
sentence to be recalled had either the structure AN_IN or IN_AN, which 
makes for a second IV with two levels.

The data obtained look something like the tables below (my apologies for 
not reorganising them by correct vs. incorrect for display purposes).

Responses in AN_IN contexts (% = % of Total)

Structure to   Use of 's     Use of 'of'   Incomplete    No recall     Total cases
be recalled    Count    %    Count    %    Count    %    Count    %    Count     %
's                70   65        9    7       20   18        9    8      108   100
of                 3    3       72   69       20   19        9    8      108   100
Total             73   34       84   39       41   19       18    9      216   100

Responses in IN_AN contexts (% = % of Total)

Structure to   Use of 's     Use of 'of'   Incomplete    No recall     Total cases
be recalled    Count    %    Count    %    Count    %    Count    %    Count     %
's                68   63       16   15       16   15        8    7      108   100
of                12   12       69   65       20   18        5    5      108   100
Total             81   37       86   40       36   17       13    6      206   100

What I'm trying to get to the bottom of is whether the proportion of 
mismatches (incorrect responses) when recalling /of/ and 's sentences in 
one semantic combination (say IN_AN) is significantly higher or lower 
than in the other (AN_IN, top table). In other words, are the IN_AN 
counts of 12 (use of 's in /of/ contexts) and 16 (use of /of/ in 's 
contexts) significantly higher than their counterparts in the other 
table?
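
To make the comparison concrete, here is a minimal base-R sketch of the raw mismatch rates. It uses the counts exactly as printed in the tables and assumes, purely for illustration, that only complete responses ('s or /of/) enter the denominator; the object and function names are made up for this example. It ignores the repeated measures per participant, so it is descriptive only and not a substitute for the mixed model.

```r
# Counts copied as printed in the two tables (complete responses only).
# Rows: structure to be recalled; columns: structure actually produced.
an_in <- matrix(c(70,  9,
                   3, 72),
                nrow = 2, byrow = TRUE,
                dimnames = list(target = c("s", "of"), produced = c("s", "of")))
in_an <- matrix(c(68, 16,
                  12, 69),
                nrow = 2, byrow = TRUE,
                dimnames = list(target = c("s", "of"), produced = c("s", "of")))

# A mismatch is any off-diagonal cell (produced structure differs from target).
mismatch_rate <- function(tab) {
  mismatches <- sum(tab) - sum(diag(tab))
  c(mismatches = mismatches, complete = sum(tab), rate = mismatches / sum(tab))
}

rates <- rbind(AN_IN = mismatch_rate(an_in), IN_AN = mismatch_rate(in_an))
print(round(rates, 3))

# Raw log-odds ratio of a mismatch, IN_AN relative to AN_IN
# (no adjustment for participant-level clustering).
odds <- rates[, "rate"] / (1 - rates[, "rate"])
log_odds_ratio <- log(odds[["IN_AN"]] / odds[["AN_IN"]])
print(log_odds_ratio)
```

Whether incomplete and no-recall responses should count as incorrect is a coding decision this sketch does not settle.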

If my understanding is correct, the model I have fit tells me there is a 
significant effect of semantic context, but the z and p values apply to 
the log-odds of obtaining a CORRECT response. Right?

Generalized linear mixed model fit by maximum likelihood ['glmerMod']
 Family: binomial ( logit )
Formula: Correct2 ~ Semantics + (1 + Syntax | ID)
   Data: CodedQ3

      AIC       BIC    logLik  deviance
 236.6061  255.5098 -113.3030  226.6061

Random effects:
 Groups Name        Variance Std.Dev. Corr
 ID     (Intercept) 0.7843   0.8856
        Syntaxs     4.0486   2.0121   -0.80
Number of obs: 324, groups: ID, 27

Fixed effects:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)     -3.0211     0.3735  -8.088 6.07e-16 ***
SemanticsIN_AN   1.1249     0.4170   2.698  0.00698 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr)
SmntcsIN_AN -0.792

Setting other considerations of model fit and collinearity aside for the 
moment, if I want to obtain a z and p value for the difference between 
proportions of incorrect responses by semantic context, how should I 
proceed?
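
As a point of reference for this question, the following base-R sketch works only from the estimates and standard errors printed in the summary above (the variable names are illustrative). It relies on two general facts about logistic models: the log-odds of one outcome are the negative of the log-odds of the other, so reversing the DV coding flips the sign of each fixed effect and its z value while leaving the p-value unchanged; and plogis() converts log-odds to a probability. The probabilities it prints are for whichever outcome glmer treated as the modelled one (the second factor level, or 1 for a 0/1 variable), for a participant whose random effects are zero at the reference level of Syntax.

```r
# Fixed-effect estimates and standard errors copied from the summary output.
est <- c(intercept = -3.0211, semantics_IN_AN = 1.1249)
se  <- c(intercept = 0.3735,  semantics_IN_AN = 0.4170)

# Wald z and two-sided p, as reported by summary().
z <- est / se
p <- 2 * pnorm(-abs(z))

# Modelling the other outcome negates every log-odds coefficient;
# the standard errors stay the same.
est_flipped <- -est
z_flipped   <- est_flipped / se
p_flipped   <- 2 * pnorm(-abs(z_flipped))

print(rbind(z = z, z_flipped = z_flipped, p = p, p_flipped = p_flipped))

# Implied probabilities of the modelled outcome in each semantic context.
p_AN_IN <- plogis(est[["intercept"]])
p_IN_AN <- plogis(est[["intercept"]] + est[["semantics_IN_AN"]])
print(c(AN_IN = p_AN_IN, IN_AN = p_IN_AN))
```

If the printed probabilities look too low for the rate of correct responses, it may be worth checking with levels(CodedQ3$Correct2) which outcome the model is actually predicting.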

In passing, could someone also point out how I would look at pairwise 
comparisons once I factor in interactions, for example if I had included 
a Semantics by Syntax interaction in the model above?

The comparisons would look like this:

syntaxof:IN_AN vs. syntaxof:AN_IN
syntax's:IN_AN vs. syntax's:AN_IN

-> where 'correct' is the reference level of the DV;

syntaxof:IN_AN vs. syntaxof:AN_IN
syntax's:IN_AN vs. syntax's:AN_IN

-> where 'incorrect' is the reference level of the DV.
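
For illustration, comparisons of this kind can be written as linear combinations of the fixed effects. The sketch below is base R only and uses HYPOTHETICAL coefficients and a HYPOTHETICAL covariance matrix for a model of the form Correct2 ~ Semantics * Syntax + (1 + Syntax | ID), assuming treatment coding with AN_IN and /of/ as reference levels; none of the numbers come from a fitted model, and wald_contrast() is a helper defined here, not a library function. With a real fit, beta and V would come from fixef(model) and as.matrix(vcov(model)).

```r
# HYPOTHETICAL fixed effects, in the order glmer would report them.
beta <- c("(Intercept)"            = -3.0,
          "SemanticsIN_AN"         =  1.2,
          "Syntaxs"                =  0.5,
          "SemanticsIN_AN:Syntaxs" = -0.4)

# HYPOTHETICAL covariance matrix of those estimates.
V <- diag(c(0.16, 0.20, 0.25, 0.36))
dimnames(V) <- list(names(beta), names(beta))
V["SemanticsIN_AN", "SemanticsIN_AN:Syntaxs"] <- -0.15
V["SemanticsIN_AN:Syntaxs", "SemanticsIN_AN"] <- -0.15

# Wald test for the linear combination k'beta.
wald_contrast <- function(beta, V, k) {
  estimate <- sum(k * beta)
  se <- sqrt(drop(t(k) %*% V %*% k))
  z <- estimate / se
  c(estimate = estimate, se = se, z = z, p = 2 * pnorm(-abs(z)))
}

# IN_AN vs. AN_IN within /of/ (reference Syntax level): the Semantics slope.
within_of <- wald_contrast(beta, V, c(0, 1, 0, 0))
# IN_AN vs. AN_IN within 's: Semantics slope plus the interaction term.
within_s  <- wald_contrast(beta, V, c(0, 1, 0, 1))

# Reversing the DV coding negates beta, so z changes sign and p is unchanged.
within_s_flipped <- wald_contrast(-beta, V, c(0, 1, 0, 1))

print(rbind(within_of = within_of, within_s = within_s,
            within_s_flipped = within_s_flipped))
```

The multcomp and emmeans packages automate this kind of contrast on a fitted model, including adjustment for multiple comparisons, which the hand calculation above does not apply.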

Any guidance is much appreciated.

-- 
Frank Romano

Sabanci University
website:http://sabanciuniv.academia.edu/FrancescoRomano
