[R-lang] Concerning glm with contrasts

Mon Jul 19 10:28:12 PDT 2010

Dear R-lang users,

I am new to the mailing list, and also rather new to R. I have a few
questions concerning the results of glm().

I am doing a study comparing the frequencies of different linguistic
constructions in one text that exists in three languages (Japanese,
Chinese, and English). The counts I obtained are as follows.

          Transitive  Passive  Intransitive  Adjectival  Others  Total
Japanese         164        9           291          36       8    508
Chinese          198        3           221          69      17    508
English          174       31           214          57      32    508
Total            536       43           726         162      57   1524

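For reference, here is a minimal sketch of how the counts above could be entered in R, tested with a chi-square test, and reshaped into the one-row-per-cell layout that glm() expects. The object names counts, chi, and comps2.long are assumptions made for illustration; the actual comps2.data is not shown in this message, and only the column names freq, language, and constructions are taken from the glm() call.

```r
# Counts from the table above; rows are languages, columns are constructions.
counts <- matrix(
  c(164,  9, 291, 36,  8,
    198,  3, 221, 69, 17,
    174, 31, 214, 57, 32),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    language      = c("Japanese", "Chinese", "English"),
    constructions = c("Transitive", "Passive", "Intransitive",
                      "Adjectival", "Others")
  )
)

# Pearson chi-square test of independence on the 3 x 5 table.
chi <- chisq.test(counts)
print(chi)

# Long format: one row per cell, with the count in a column named "freq".
# comps2.long is a hypothetical stand-in for comps2.data.
comps2.long <- as.data.frame(as.table(counts), responseName = "freq")

# Re-create the factors with alphabetical level order, which is R's default
# and matches the row order of the contrasts() output further down.
comps2.long$language      <- factor(as.character(comps2.long$language))
comps2.long$constructions <- factor(as.character(comps2.long$constructions))
```
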
A chi-square test gives a significant result. I wanted to do further analysis
to see whether there are any differences among the languages, so I did the
following:

> glm.out4 <- glm(freq ~ language * constructions, data = comps2.data,
+                 family = poisson,
+                 contrasts = list(language = contrastml,
+                                  constructions = contrastmc))
> summary(glm.out4)

Call:
glm(formula = freq ~ language * constructions, family = poisson,
    data = comps2.data, contrasts = list(language = contrastml,
        constructions = contrastmc))

Deviance Residuals:
 [1]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Coefficients:
                         Estimate Std. Error z value Pr(>|z|)
(Intercept)               3.93111    0.05890  66.746  < 2e-16 ***
language1                 0.20443    0.08435   2.424 0.015363 *
language2                 0.16064    0.09497   1.692 0.090740 .
constructions1           -1.25129    0.06779 -18.459  < 2e-16 ***
constructions2            1.68783    0.18775   8.990  < 2e-16 ***
constructions3           -0.01655    0.08647  -0.191 0.848205
constructions4            1.12805    0.13321   8.468  < 2e-16 ***
language1:constructions1  0.12190    0.09726   1.253 0.210090
language2:constructions1  0.26651    0.10562   2.523 0.011625 *
language1:constructions2  0.15838    0.24722   0.641 0.521744
language2:constructions2 -0.98403    0.32782  -3.002 0.002684 **
language1:constructions3 -0.15971    0.12915  -1.237 0.216218
language2:constructions3  0.44708    0.12620   3.543 0.000396 ***
language1:constructions4 -0.51918    0.21538  -2.411 0.015931 *
language2:constructions4  0.19079    0.18724   1.019 0.308207
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance:  1.3744e+03  on 14  degrees of freedom
Residual deviance: -3.1086e-15  on  0  degrees of freedom
AIC: 116.66

Number of Fisher Scoring iterations: 3

> contrasts(language)
         [,1] [,2]
Chinese     0   -1
English     1    1
Japanese   -1    0
> contrasts(constructions)
             [,1] [,2] [,3] [,4]
Adjectival      0    0   -1    0
Intransitive    1    1    1    1
Others          0    0    0   -1
Passive         0   -1    0    0
Transitive     -1    0    0    0
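One way to check what a custom contrast matrix actually tests is to put an intercept column in front of it and invert the result: each row of the inverse gives the weights on the level means that the corresponding coefficient estimates. The sketch below does this for the language factor only. The matrix is retyped from the contrasts(language) output above, so it is a reconstruction rather than the original contrastml object, and the name weights is an assumption for illustration.

```r
# Hypothetical reconstruction of the language contrast matrix printed above
# (the original contrastml object is not shown in the message).
contrastml <- matrix(c( 0, -1,
                        1,  1,
                       -1,  0),
                     nrow = 3, byrow = TRUE,
                     dimnames = list(c("Chinese", "English", "Japanese"), NULL))

# Coding matrix: intercept column plus the two contrast columns.
coding <- cbind(1, contrastml)

# Invert it. Row k holds the weights on the three level means
# (on the log-count scale) that coefficient k estimates.
weights <- solve(coding)
dimnames(weights) <- list(c("(Intercept)", "language1", "language2"),
                          rownames(contrastml))
print(round(weights, 3))
```

If the matrix is as printed, the language1 row has weights (1/3, 1/3, -2/3) on Chinese, English, and Japanese, so it compares the average of the three languages with Japanese rather than English with Japanese directly. Because every contrast column sums to zero, the intercept row weights all levels equally, which makes it a grand mean on the log scale.
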

So my questions are:
(1) I am not sure how to interpret these results. Since language1 shows a
significant difference, does it mean that "English and Japanese are
significantly different in terms of the distribution of the different
constructions used"?
(2) Does the "intercept" represent anything at all? If yes, what does it
represent in this case?
(3) If I want to test whether English uses passive significantly more than
Japanese, and Japanese uses intransitive significantly more than English,
how should I modify the contrasts/commands?

Any help is appreciated.
Thanks.

-- 
Zoe Luk
Department of Linguistics
University of Pittsburgh