[R-lang] Re: How to handle missing data when I try to log-transform my data

Hugo Quené H.Quene@uu.nl
Wed Jun 23 12:10:04 PDT 2010


Dear list,

On 2010.06.23 20:08 , Scott Jackson wrote:
> If you go the route of substituting NAs for too-short times instead of
> completely deleting that observation from your data.frame, here's a
> tip.  The following does NOT work:
>
> data$RT[data$RT<  100]<- NA
>
> The following DOES work:
>
> is.na(data$RT[data$RT<  100])<- TRUE
>

My solution in this case is slightly different and more flexible, 
using the "subset" argument in lmer and lm:

myselection <- (data$RT >= 100) # logical vector: TRUE = keep this row
mymodel <- lmer( RT ~ 1+cond+(1|subject)+(1|item),
	data=data, subset=myselection, ... )

(Verify the reported number of cases, subjects, items, etc.)

This solution leaves the original data intact, and it's easy to 
adjust the exclusion conditions and then re-run your analysis.
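
As a minimal sketch of the same idea, here is a version that runs 
without lme4, using lm on a small made-up data frame. The name "toy" 
and its values are assumptions for illustration only; the columns RT 
and cond follow the example above.

```r
# Toy data (invented values); three RTs fall below the 100 ms cut-off.
toy <- data.frame(
  RT   = c(50, 80, 250, 300, 420, 510, 90, 610, 330, 275),
  cond = factor(rep(c("a", "b"), times = 5))
)

# Logical vector: TRUE means the row is used in the model.
keep <- !is.na(toy$RT) & toy$RT >= 100

# subset= filters rows at fitting time; toy itself is not modified.
fit <- lm(RT ~ 1 + cond, data = toy, subset = keep)

n_used  <- nobs(fit)   # rows that actually entered the model
n_total <- nrow(toy)   # rows in the untouched original data
```

Comparing n_used with n_total is one quick way to verify how many 
cases were excluded. To change the cut-off, only the line defining 
"keep" needs editing.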

BTW, RTs are typically right-skewed rather than normally distributed, 
so a transformation is often necessary, using e.g. log(RT) or 1/RT 
as your dependent variable.
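
A sketch of how the transformation can be combined with the subset 
argument, again with lm and invented data (the data frame "toy" is an 
assumption, not part of the original analysis). The transformation is 
written inside the formula, so the stored RT column stays on its 
original scale. Note that log() of a missing value is simply NA, and 
the selection vector below leaves such rows out.

```r
# Toy data (invented values) with one missing RT and two too-short RTs.
toy <- data.frame(
  RT   = c(50, NA, 250, 300, 420, 510, 90, 610, 330, 275),
  cond = factor(rep(c("a", "b"), times = 5))
)

# Keep only rows with a non-missing RT of at least 100 ms.
keep <- !is.na(toy$RT) & toy$RT >= 100

# Log-transformed response, computed inside the formula.
fit_log <- lm(log(RT) ~ 1 + cond, data = toy, subset = keep)

# Reciprocal response; I() marks the arithmetic expression explicitly.
fit_inv <- lm(I(1 / RT) ~ 1 + cond, data = toy, subset = keep)

n_log <- nobs(fit_log)   # rows used in the log model
n_inv <- nobs(fit_inv)   # rows used in the reciprocal model
```

The same formula syntax should carry over to lmer, with the random 
effects terms added as in the model above.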

Best wishes, Hugo Quené

-- 
Dr Hugo Quené | Assoc Prof Phonetics | Departement Moderne Talen | 
Utrecht inst of Linguistics OTS | Utrecht University | Trans 10 | 
3512 JK Utrecht | The Netherlands | T +31 30 253 6070 | F +31 30 253 
6000 | H.Quene@uu.nl | www.hugoquene.nl | www.hum.uu.nl


