[R-lang] Question about power

Wed Aug 3 14:26:13 PDT 2011

Hi,

I have a question about making inferences when power might be an issue.  I'm examining whether a variable has a significant effect in different parts of the syllable.  To do this, I have 2 different data sets, Onset and Coda, which I'm using to determine whether the variable has effects in the syllable onset and coda, respectively.  The variable is significant (very small p-value) in the onset but only marginally significant in the coda (p = .055 in the full model, and model comparison with a baseline model that does not contain this variable gives p = .07).

While it's always difficult to know how to interpret a marginally significant effect, one issue that complicates the matter is that the Coda dataset has fewer items and trials than the Onset dataset.  One thing that I'd like to do is determine whether the marginal effect could simply be due to a lack of power.  My idea was to take a random sample of the Onset dataset so that it matches the size of the Coda dataset and see if the variable of interest remains significant even in this reduced dataset.  I figure that I would need to repeat this sampling many times (e.g., 10,000 times) to see how consistently the effect remains significant at the smaller sample size.
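
To make the procedure concrete, here is a rough sketch of what I have in mind. It is not my actual analysis: the data are simulated, the names (z_test_p, subsample_significance_rate) are made up for illustration, and a simple two-group z-test stands in for the real model comparison, which would have to be refit on every subsample. It also samples individual trials, whereas sampling whole items might be more appropriate.

```python
import random
from statistics import NormalDist, mean, stdev


def z_test_p(rows):
    """Stand-in test (an assumption, not the real model): two-sided p-value
    for a difference in means between condition 0 and condition 1, using a
    normal approximation. Each row is a (condition, response) pair."""
    a = [y for cond, y in rows if cond == 0]
    b = [y for cond, y in rows if cond == 1]
    if len(a) < 2 or len(b) < 2:
        return 1.0  # too few observations in one group to test anything
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    if se == 0:
        return 1.0
    z = (mean(b) - mean(a)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def subsample_significance_rate(rows, n_sub, n_iter=1000, alpha=0.05,
                                test=z_test_p, seed=0):
    """Draw n_iter random subsamples of size n_sub (without replacement)
    and return the proportion in which test(subsample) < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_iter):
        sub = rng.sample(rows, n_sub)
        if test(sub) < alpha:
            hits += 1
    return hits / n_iter


# Simulated stand-in for the Onset data: 300 trials per condition,
# with a hypothetical effect of 0.5 standard deviations.
sim = random.Random(2011)
onset = [(cond, sim.gauss(0.5 * cond, 1.0))
         for cond in (0, 1) for _ in range(300)]

# Pretend the Coda data set has 200 trials in total.
rate = subsample_significance_rate(onset, n_sub=200, n_iter=1000)
print("proportion of subsamples with p < .05:", rate)
```

The result would be read as a single summary figure (how often the Onset effect survives at the Coda sample size) rather than as 10,000 separate tests.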

Is this a sensible approach?  Am I going to run into inflated Type I or Type II error rates by doing 10,000 model comparisons?

Thank you,
Ariel