[R-lang] Re: How to handle missing data when I try to log-transform my data

Scott Jackson scottuba@gmail.com
Wed Jun 23 11:08:10 PDT 2010


I may be missing something, but I'm not sure why you would want to
replace RTs below 100ms with 0ms.  Surely that introduces much more
bias than if you had just left the too-low RTs in?

I agree with the others that simply rejecting trials with too-low RTs
is probably the way to go.  At least, that's a common practice,
especially if you're using lmer or another analysis that does not
require balanced data.
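
For what it's worth, here is a minimal sketch of the rejection approach.
The data frame and its column names (subj, RT) are made up for
illustration, and the 100ms cutoff is the one from the original
question:

```r
# Toy data; the column names (subj, RT) and the values are hypothetical.
d <- data.frame(subj = rep(c("s1", "s2"), each = 4),
                RT   = c(350, 80, 420, 510, 95, 610, 388, 450))

# Keep only trials at or above the 100 ms cutoff.
d_ok <- subset(d, RT >= 100)

# With the too-short trials gone there are no zeros, so log() never
# returns -Inf.
d_ok$logRT <- log(d_ok$RT)

# Trial counts per subject after rejection; in real data these will
# usually differ across subjects, which is fine for lmer.
table(d_ok$subj)
```

Note that subset() also drops rows where RT is already NA, which is
usually what you want here.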

Alternatively, depending on what kind of data you have, you might try
a more sophisticated missing-data imputation technique, like multiple
imputation.  There's a very nice R package for imputation called
"mice" that I have been using extensively lately, which has very
readable and helpful documentation, even if you're new to multiple
imputation.  There's another package called "Amelia" that I have not
used, but which also has excellent-looking documentation.
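
A rough sketch of what that could look like with mice, assuming the
package is installed.  The data are simulated and the predictor names
(wordlen, position) are invented for the example; whether imputing
too-fast RTs makes sense at all depends on your data, so treat this as
an outline rather than a recipe:

```r
library(mice)  # assumes install.packages("mice") has been run

set.seed(1)
n <- 200
# Simulated reading times with two made-up predictors.
d <- data.frame(wordlen  = sample(2:10, n, replace = TRUE),
                position = sample(1:12, n, replace = TRUE))
d$RT <- exp(5.5 + 0.04 * d$wordlen + rnorm(n, sd = 0.3))
d$RT[c(3, 17, 42)] <- c(60, 85, 40)   # plant a few too-short RTs

# Mark too-short RTs as missing, then work on the log scale.
is.na(d$RT) <- d$RT < 100
d$logRT <- log(d$RT)

# Create 5 imputed data sets using predictive mean matching.
imp <- mice(d[, c("wordlen", "position", "logRT")],
            m = 5, method = "pmm", seed = 1, printFlag = FALSE)

# Fit the same model to each imputed data set and pool the estimates.
fit <- with(imp, lm(logRT ~ wordlen + position))
summary(pool(fit))
```

The raw RT column is left out of the imputation model on purpose,
since it is just a transformation of logRT.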

If you go the route of substituting NAs for too-short times instead of
completely deleting that observation from your data.frame, here's a
tip.  Direct assignment works on a plain numeric column:

data$RT[data$RT < 100] <- NA

The replacement form of is.na() does the same thing and may be the
safer habit, since it has its own method for factors:

is.na(data$RT[data$RT < 100]) <- TRUE
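
To show why NA plays better with the log transform than 0 does, here
is a small sketch on made-up numbers (the data frame and its values are
hypothetical):

```r
# Toy reading times; one of them is a 0 left over from the old coding.
d <- data.frame(RT = c(350, 80, 420, 0, 610))

# With a 0 in the data, log() gives -Inf for that trial.
log(d$RT)

# Mark everything under the 100 ms cutoff as missing instead.
is.na(d$RT[d$RT < 100]) <- TRUE

# NA simply stays NA under log(); no -Inf appears.
d$logRT <- log(d$RT)

# Summaries need na.rm = TRUE to skip the missing trials.
mean(d$logRT, na.rm = TRUE)
```

Model-fitting functions such as lm() and lmer() drop rows with NA by
default (na.action = na.omit), so in practice this ends up equivalent
to rejecting those trials.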

good luck,
-scott

On Wed, Jun 23, 2010 at 1:24 PM, Xiao He <praguewatermelon@gmail.com> wrote:
> Dear R-lang users
> I have a question that is, I suppose, less related to the use of R.
> I have a set of self-paced reading data, and all the RTs that are below
> 100ms are to be discarded. What I used to do when analyzing raw data was to
> replace discarded values with 0. That was all simple and easy. But I
> recently started to analyze log-transformed data. An issue then arises as to
> how to handle missing data. Obviously, if I replace the discarded raw data
> points with 0, log transformation does not work, as it will return "-Inf"
> for obvious reasons. So I would like to know what you would suggest me to do
> in my case. Thank you very much in advance.
>
>
> Xiao He
>

