[R-lang] High collinearity

Marco dutch.linguistics at gmail.com
Sat Apr 11 05:55:39 PDT 2009


Dear R-langs,

I have a data set that contains highly correlated variables (> .90), all of
which are variables that occur on the same time scale. Crucially, I want to
determine whether one of these variables (End) has explanatory power over and
above all the other ones. In this case, is it legitimate to take the residuals
of End (by fitting an lm model in which End is predicted from all the other,
correlated variables) and then run an lmer model that contains only
resid_end? When I look at the results I obtain, it seems that the other
correlated variables result in corrupted residuals for End. Are there any
other methods to deal with (and distinguish between) highly correlated
variables in R? Or could you tell me whether it is valid to use these
residuals (and the F values obtained for these residuals), even though the
beta coefficients are uninterpretable?
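
To make the procedure concrete, here is a minimal sketch on simulated data.
The variable names (y, start, mid, subject), the random-intercept structure,
and all simulated values are assumptions made for illustration and are not the
actual data set. The sketch residualizes End on the other predictors with lm,
then compares that approach with a nested model comparison, which is one common
alternative way to ask whether End adds explanatory power. The lmer part runs
only if the lme4 package is installed.

```r
# Simulated data: every name and value below is a hypothetical stand-in.
set.seed(1)
n       <- 200
subject <- factor(rep(1:20, each = 10))
start   <- rnorm(n)
mid     <- start + rnorm(n, sd = 0.3)   # highly correlated with start
end     <- mid   + rnorm(n, sd = 0.3)   # highly correlated with mid and start
subj_eff <- rnorm(20, sd = 0.5)[as.integer(subject)]
y <- 1 + 0.5 * start + 0.3 * end + subj_eff + rnorm(n)
d <- data.frame(y, start, mid, end, subject)

# Step 1: residualize End on the other correlated predictors.
d$resid_end <- resid(lm(end ~ start + mid, data = d))

# By construction, resid_end is uncorrelated with those predictors.
print(cor(d$resid_end, d$start))
print(cor(d$resid_end, d$mid))

# Step 2 (ordinary regression, for illustration): the residualized predictor
# next to the alternative of keeping End in the full model.
m_base  <- lm(y ~ start + mid, data = d)
m_full  <- lm(y ~ start + mid + end, data = d)
m_resid <- lm(y ~ start + mid + resid_end, data = d)

print(coef(m_full)["end"])
print(coef(m_resid)["resid_end"])

# Nested comparison: does adding End improve on the other predictors?
print(anova(m_base, m_full))

# Step 3 (mixed model, only if lme4 is available): the same comparison with a
# by-subject random intercept, plus the model that contains only resid_end.
if (requireNamespace("lme4", quietly = TRUE)) {
  lm0 <- lme4::lmer(y ~ start + mid + (1 | subject), data = d, REML = FALSE)
  lm1 <- lme4::lmer(y ~ start + mid + end + (1 | subject), data = d, REML = FALSE)
  lmr <- lme4::lmer(y ~ resid_end + (1 | subject), data = d, REML = FALSE)
  print(anova(lm0, lm1))   # likelihood-ratio test for End
  print(summary(lmr))      # model with only the residualized End
}
```

In the ordinary regression case, the coefficient for resid_end in m_resid
equals the coefficient for end in m_full, because the two models span the same
predictor space. Whether the residuals-only lmer model is appropriate for real
data is a separate question that this sketch does not settle.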

Thanks in advance!

Marco