[R-lang] Re: What happens if I include a continuous variable?

David Reitter reitter@cmu.edu
Thu Sep 16 05:17:18 PDT 2010


Hi Roger,

On Sep 16, 2010, at 6:54 AM, Roger van-Gompel wrote:
> 
> However, let’s say I failed to control for word length, and length
> and frequency tend to be highly correlated. Can I then include length as
> a continuous variable (after centering) in order to deconfound frequency
> from length?

In principle yes, but collinearity is a caveat.

> In other words, if I still get an effect of frequency in the second
> model, does this mean that this effect is not due to/not confounded with
> length?

Yes, that's a good start, but have a look at the correlation matrix. If, after centering, you still find substantial correlations (perhaps above 0.2 in absolute value) between frequency or length and their interaction, you will want to do something about it. Otherwise, significance tests of the fitted effects won't tell you much, because these variables remain confounded. See also:

http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf
(pages 117ff)

or: http://en.wikipedia.org/wiki/Multicollinearity
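
As a rough sketch of the correlation check described above: the data here are simulated, and the names Freq, Len, cFreq, cLen and FreqByLen are placeholders rather than anything from the original dataset. Note that centering leaves the correlation between the two predictors themselves unchanged; it only affects how strongly each of them correlates with the interaction term.

```r
# Simulated stand-in data (an assumption, not the real experiment).
set.seed(1)
n    <- 200
Len  <- round(runif(n, 3, 12))          # word length in letters
Freq <- 10 - 0.6 * Len + rnorm(n)       # frequency, built to correlate with length

# Center both predictors; scale = FALSE keeps the original units.
cFreq <- as.numeric(scale(Freq, scale = FALSE))
cLen  <- as.numeric(scale(Len,  scale = FALSE))

# Correlation matrix of the two centered predictors and their interaction.
predictors <- cbind(cFreq = cFreq, cLen = cLen, FreqByLen = cFreq * cLen)
cor_mat <- cor(predictors)
print(round(cor_mat, 2))
```

Off-diagonal entries well above the rough 0.2 threshold would be the cue to take one of the steps discussed next.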

One way to address collinearity is to regress out length from frequency first, e.g. creating testdata$FreqWithoutLen from something like resid(lm(Freq ~ Len)).
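
A minimal sketch of that residualization, assuming a plain lm() fit and a simulated data frame; the column names Freq, Len and lexdectime follow the ones used in this thread, but the values are made up for illustration.

```r
# Simulated stand-in for testdata (an assumption, not the real data).
set.seed(2)
n <- 200
testdata <- data.frame(Len = round(runif(n, 3, 12)))
testdata$Freq <- 10 - 0.6 * testdata$Len + rnorm(n)
testdata$lexdectime <- 500 + 15 * testdata$Len - 20 * testdata$Freq +
  rnorm(n, sd = 30)

# Residuals of Freq ~ Len: the part of frequency that length
# cannot linearly predict.
testdata$FreqWithoutLen <- resid(lm(Freq ~ Len, data = testdata))

# By construction this is (numerically) uncorrelated with Len.
print(cor(testdata$FreqWithoutLen, testdata$Len))

# Use the residualized predictor alongside length in the main model.
m_resid <- lm(lexdectime ~ Len + FreqWithoutLen, data = testdata)
print(summary(m_resid)$coefficients)
```

In a model that also contains Len, the coefficient of FreqWithoutLen equals the one raw Freq would receive; what changes is that the Len coefficient now absorbs the variance the two predictors share.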

I wonder if it would be OK to do the regression in steps, i.e. regressing out length from your response variable (lexdectime) first, and then fitting the main model to the residuals.
Similarly, I would think that a significant improvement in fit when the Frequency variable is added would demonstrate the significance of its effect (use anova() to compare the two nested models, or just apply anova() to the full model, which tests the terms sequentially in the order they were entered, so Length has to come first).  Opinions?
This assumes that your a priori hypothesis is that frequency matters, and that you are trying to harden that answer against a possible effect of length.
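
The model comparison suggested above might look roughly like this. It is only a sketch on simulated data with ordinary lm() fits; the names d, m_len and m_both are placeholders, and a real analysis of lexical decision times may call for a different model class.

```r
# Simulated stand-in data (an assumption, not the real experiment).
set.seed(3)
n <- 200
d <- data.frame(Len = round(runif(n, 3, 12)))
d$Freq <- 10 - 0.6 * d$Len + rnorm(n)
d$lexdectime <- 500 + 15 * d$Len - 20 * d$Freq + rnorm(n, sd = 30)

# Nested models: length only, then length plus frequency.
m_len  <- lm(lexdectime ~ Len, data = d)
m_both <- lm(lexdectime ~ Len + Freq, data = d)

# F test: does adding Freq improve the fit beyond Len alone?
cmp <- anova(m_len, m_both)
print(cmp)

# Sequential table for the full model: each term is tested after
# the terms listed before it, so Freq is tested after Len.
seq_tab <- anova(m_both)
print(seq_tab)
```

With Len entered first, the Freq row of the sequential table and the two-model comparison give the same F test.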

(Finally, Length and Frequency may work better when log-transformed.)

Hope that helps.
- David


--
Dr. David Reitter
Department of Psychology
Carnegie Mellon University



