[R-lang] p-values for factors using lmer & mcmc

Thu Aug 30 14:38:03 PDT 2007

On 8/30/07, Kathryn Campbell-Kibler <kathryn at campbell-kibler.com> wrote:
>
> Hi all,
>
> I recently upgraded versions of R and the package lme4 in a fit of
> something-or-other.  My older version was old enough to still have
> p-values.  I now have the current version (0.99875-7), and am learning
> to use mcmc to calculate p-values, but basically all my independent
> variables are factors, so p-values for each level are not really
> helpful, I need to estimate the impact of the whole factor. I can live
> without p-values on the model itself, but the lack of them in anova is
> killing me. I've been trying out mcmcpvalue from here:
>
> http://wiki.r-project.org/rwiki/doku.php?id=guides:lmer-tests
>
> But it seems to only work for the linear models, not the glmms.  Is
> there something out there for those, or a way to adapt this script for
> it?
>
> Also, I'm not really understanding the structure well enough to get
> how to get it to evaluate interactions (or if it will).  If I'm
> looking at something like this:
>
> > HPDinterval(status_samp)
>                                                   lower       upper
> (Intercept)                                  3.96106199  4.54810253
> sregionivan                                 -1.49569838 -0.32674386
> sregionjason                                -1.08346050  0.06535123
> sregionsouth                                -0.96097370 -0.13712609
> ininging                                    -0.15327601  0.19483823
> factor(workingclass)1                       -1.40345872 -0.53510599
> sregionivan:ininging                        -0.28212347  0.48140623
> sregionjason:ininging                        0.00660252  0.72089386
> sregionsouth:ininging                       -0.34948297  0.17119827
> sregionivan:factor(workingclass)1            0.19306970  1.56657035
> sregionjason:factor(workingclass)1          -0.13693835  2.36486351
> sregionsouth:factor(workingclass)1           0.10013804  1.11524003
> ininging:factor(workingclass)1               0.60875674  1.78589051
> sregionivan:ininging:factor(workingclass)1  -2.36086950 -0.34788386
> sregionjason:ininging:factor(workingclass)1 -2.16553468  0.84887041
> sregionsouth:ininging:factor(workingclass)1 -1.65891153 -0.26497955
> log(sigma^2)                                -0.56604768 -0.37239636
> log(id.(In))                                -2.44938452 -1.67803069
> log(word.(In))                              -2.24918742 -0.96898869
> attr(,"Probability")
> [1] 0.95
>
> which lines together give me the interaction of ining and
> factor(workingclass)?  Is it just
>
> ininging:factor(workingclass)1               0.60875674  1.78589051
>
> or is it
>
> ininging                                    -0.15327601  0.19483823
> factor(workingclass)1                       -1.40345872 -0.53510599
> ininging:factor(workingclass)1               0.60875674  1.78589051

the former. the interaction term is ininging:factor(workingclass)1, but it's
interpreted in the context of the main effects. so for you,

there's an (insignificant) decrease if ininging is true and one
(significant) effect decrease factor(workingclass) is 1. additionally
there's an increase if both is true (on top of what's given by the main
effects). of course, it's important to be aware that interactions usually
lead to collinearity, especially if do not center your variables (see
"scale()").

One tempting option is to look at the p-values for anova comparing two
> models, one with and one without the term or interaction I'm
> interested in.  But searching on the R-help list tells me that's not a
> good idea, as it is anti-conservative.  Can someone explain why, or
> point me to a good explanation (where good=using small non-technical
> words)?

are we still talking about non-linear models? what is your linking function.
in lmer mixed logit models are fitted using penalized quasi-likelihood
maximization, which in small non-technical words means that when you compare
those likelihoods (measures of model fit) for two logit models (one with and
one without a parameter/predictor of interest) you could even end up finding
that the bigger model is less likely (which cannot happen with maximum
likelihood fits). that makes comparing two mixed logit models by means of
the anova() function (i.e. by means of likelihood ratios) problematic. but
mixed logit models should have p-values for the coefficients (based on the
wald statistic). so what type of model are you using?

as for your the fact that you're interested in entire factors rather than
parameters - why? if a factor with 4 levels is significant according to
model comparison, but none of the parameters/coefficients associated with
that factor reaches significance in the model that isn't good anyway (could
be overfitting, could be the wrong coding of levels, could be too many
levels that don't matter). wouldn't you want to know anyway which contrasts
actually contain information? for that you can recode your three level
factor in two binary factors and then see whether there coefficients are
significant.

sorry, that this is all over the place. it would help to know what linking
function you are interested in. what GLMM are you using?

Florian

Any help is much appreciated.
>
> Thanks,
>
> Kathryn
> _______________________________________________
> R-lang mailing list
> R-lang at ling.ucsd.edu
> https://ling.ucsd.edu/mailman/listinfo.cgi/r-lang
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://ling.ucsd.edu/pipermail/r-lang/attachments/20070831/3df8ac0f/attachment.htm