[R-lang] Re: Positive and negative logLik and BIC in model comparison (lmer)
Nathaniel Smith
njs@pobox.com
Mon Aug 1 11:52:30 PDT 2011
On Mon, Aug 1, 2011 at 6:45 AM, piantado <piantado@mit.edu> wrote:
> Well, I think which one you choose mainly depends on your theoretical
> outlook (although people on the list may disagree?), since the two are
> motivated by information-theoretic (AIC) vs. Bayesian (BIC) ideals. For
> non-tiny data sets, BIC penalizes free parameters more harshly: its
> penalty is k*log(n), versus AIC's 2k.
Sadly, it's rather more complicated than that. The best paper I've
come across for these issues is:
Burnham, K. P., & Anderson, D. R. (2004). Multimodel Inference:
Understanding AIC and BIC in Model Selection. Sociological Methods &
Research, 33(2), 261-304. doi:10.1177/0049124104268644
As the abstract notes: "AIC can be justified as Bayesian using a
“savvy” prior on models that is a function of sample size and the
number of model parameters. Furthermore, BIC can be derived as a
non-Bayesian result. Therefore, arguments about using AIC versus BIC
for model selection cannot be from a Bayes versus frequentist
perspective."
I have a vague preference for the AIC because I have a vague
understanding of where the magic numbers in the formula come from
(it's an estimate, up to a constant, of the KL divergence between your
model and the true model -- the maximized log-likelihood gives a biased
estimate of this quantity, and the "2k" comes from a first-order Taylor
approximation of the bias term; subtracting it off makes the estimate
approximately unbiased).
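If you want to see that bias directly, here's a quick simulation
sketch (plain lm rather than lmer, just to keep it simple; the toy
model, seed, and replication count are all arbitrary). The in-sample
maximized log-likelihood should overestimate the expected
out-of-sample log-likelihood by roughly k -- here k = 3: intercept,
slope, and residual variance.

set.seed(1)
n <- 100
bias <- replicate(2000, {
  x <- rnorm(n)
  y <- 1 + 2 * x + rnorm(n)           # training data from the true model
  fit <- lm(y ~ x)
  s <- sqrt(mean(resid(fit)^2))       # ML estimate of the residual sd
  x.new <- rnorm(n)
  y.new <- 1 + 2 * x.new + rnorm(n)   # fresh data from the same model
  ll.in <- sum(dnorm(y, fitted(fit), s, log = TRUE))
  ll.out <- sum(dnorm(y.new, predict(fit, data.frame(x = x.new)),
                      s, log = TRUE))
  ll.in - ll.out                      # in-sample optimism for this draw
})
mean(bias)  # comes out near k = 3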
I don't have the equivalent understanding for the BIC. But this is
probably just because I haven't yet spent the time to finish reading
the above paper.
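Incidentally, it's easy to check by hand where the two penalties enter
for an lmer fit. Here's a minimal sketch, assuming a reasonably current
lme4 and its built-in sleepstudy data (note the ML fit -- REML
log-likelihoods aren't comparable across different fixed-effects
structures):

library(lme4)
fit <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
ll <- logLik(fit)
k <- attr(ll, "df")  # number of estimated parameters
n <- nobs(fit)       # number of observations
-2 * as.numeric(ll) + 2 * k       # matches AIC(fit)
-2 * as.numeric(ll) + log(n) * k  # matches BIC(fit)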
-- Nathaniel