[R-lang] Removing collinearity from control variables
Ariel M. Goldberg
ariel.goldberg@tufts.edu
Sun Jan 9 16:08:53 PST 2011
Dear R-langers,
I am working with data from the English Lexicon Project and am using the variables described by Baayen, Feldman & Schreuder (2006) to control for the basic factors that influence reading time (frequency, length, etc.). My goal is to determine whether other variables are significant once these factors have been controlled for. I'd like to remove the collinearity from Baayen et al.'s variable set, and I was wondering if you had any suggestions as to the best way to do this. I was thinking that PCA might be the best option, particularly since I'm not concerned with interpreting the control variables at all. Do you think that's a good way to go about it?
Also, if PCA is good, I have a quick question. Do I use all the principal components it creates, in order to account for 100% of the variance? I think this makes sense since, again, I'm not trying to interpret the various components.
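To make the question concrete, here is a minimal sketch of what I have in mind. It runs on simulated data, and every variable name in it (log_freq, length_letters, n_syllables, rt, new_var) is a hypothetical stand-in, not an actual column from the English Lexicon Project or from Baayen et al.'s set. The only assumption about tooling is base R's prcomp() and lm().

```r
# Simulated, deliberately correlated control variables.
# All names here are hypothetical stand-ins for the real predictors.
set.seed(1)
n <- 200
log_freq       <- rnorm(n)
length_letters <- -0.6 * log_freq + rnorm(n, sd = 0.8)
n_syllables    <- 0.7 * length_letters + rnorm(n, sd = 0.5)
controls <- data.frame(log_freq, length_letters, n_syllables)

# PCA on the centered and scaled controls.
pca <- prcomp(controls, center = TRUE, scale. = TRUE)

# Proportion of variance per component, and the running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Correlations among the component scores (off-diagonal entries only).
score_cor <- cor(pca$x)
max_offdiag <- max(abs(score_cor[upper.tri(score_cor)]))

# Hypothetical response and variable of interest, then a model that
# uses all of the component scores as controls.
rt      <- 0.5 * log_freq + 0.3 * length_letters + rnorm(n)
new_var <- rnorm(n)
model_data <- data.frame(pca$x, rt = rt, new_var = new_var)
fit <- lm(rt ~ PC1 + PC2 + PC3 + new_var, data = model_data)

print(cum_var)
print(max_offdiag)
print(summary(fit)$coefficients)
```

With all of the components kept, cum_var ends at 1 and the component scores are uncorrelated with one another (max_offdiag is zero up to rounding), so the collinearity among the controls themselves is gone.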
Thanks!
Ariel