[R-lang] Re: plotting residuals

Tue Apr 13 11:05:52 PDT 2010

Hi Kyle,

plot(...) called on an lm object dispatches to plot.lm, which you can read
about with "?plot.lm"--the help page describes each of the plots and has
some references for regression diagnostics.
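
As a rough sketch of how to pick out individual diagnostic plots: plot.lm takes a "which" argument, and which = 1 gives residuals vs. fitted values, which is probably the easiest one to explain. The data below are simulated purely for illustration (the names rank, freq and fit are made up here, not taken from your analysis).

```r
# Made-up rank/frequency data that roughly follows a power law
set.seed(1)
rank <- 1:200
freq <- 1000 / rank * exp(rnorm(200, sd = 0.1))

# Linear model in log-log space
fit <- lm(log(freq) ~ log(rank))

# Only the residuals-vs-fitted plot
plot(fit, which = 1)

# The four default plots (which = c(1, 2, 3, 5)) on a single page
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)
```

In the default set, plot 2 is a normal Q-Q plot of the residuals, plot 3 is scale-location, and plot 5 is residuals vs. leverage.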

If you would like to extract the residuals from the regression and do
something else with them, you can call the function residuals( ... ) on
an lm object.
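
For example, a minimal sketch of pulling out the residuals and adding your own smoother might look like the following. Again the data are simulated and the variable names are hypothetical; lowess is just one choice of smoother, not a recommendation over any other.

```r
# Made-up rank/frequency data, as a stand-in for the real counts
set.seed(1)
rank <- 1:200
freq <- 1000 / rank * exp(rnorm(200, sd = 0.1))
fit <- lm(log(freq) ~ log(rank))

# Extract the raw residuals (on the log-frequency scale)
res <- residuals(fit)

# Plot them against the predictor and overlay a lowess smoother
plot(log(rank), res, xlab = "log rank", ylab = "residual")
abline(h = 0, lty = 2)
lines(lowess(log(rank), res), col = "red")
```

If the smoother stays close to the dashed zero line except near the ends of the range, that is the "no trend except at the endpoints" pattern.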

If you are interested in checking for Zipfian distributions, there are
some people who have derived tests for it (e.g., Urzua 2000: "A simple and
efficient test for Zipf’s law" has one, and cites a few others). Urzua
argues against doing a linear regression on log rank vs. log frequency
because the residuals are not normal--he suggests you should really work
with tests and diagnostics that respect the distribution implied by
Zipf's law.
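
To be clear, the following is not Urzua's test. It is only a quick way to look at the normality of the regression residuals yourself, using a normal Q-Q plot and a Shapiro-Wilk test from base R. The data are simulated with normal errors for illustration, so they will look better behaved than real frequency counts are likely to.

```r
# Made-up rank/frequency data with log-normal noise (an assumption)
set.seed(1)
rank <- 1:200
freq <- 1000 / rank * exp(rnorm(200, sd = 0.1))
fit <- lm(log(freq) ~ log(rank))
res <- residuals(fit)

# Normal Q-Q plot: points far from the line suggest non-normal residuals
qqnorm(res)
qqline(res)

# Shapiro-Wilk test of normality (works for 3 to 5000 observations)
sw <- shapiro.test(res)
print(sw)
```

A small p-value from shapiro.test on real data would be one sign that the usual inference from the log-log regression should not be trusted.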

Hope that helps!

++Steve

> 
> I have a nice instance of Zipfian (well, power law) dynamics that I've plotted (log-rank vs. log-frequency). I also fit a linear model to the data in log-space and want to show there's no interesting trend in the residuals except for at the endpoints. Simply plotting the residuals seems a bit simple, but I don't quite understand what the different measures are that are shown by calling plot() on the lm() instance. Does anyone have a particular idea about which of the four different plots is most relevant and easy to interpret/explain, or could point me to a useful resource? Also, if I end up just plotting residuals and fitting some kind of smoother to them, should I do it in the absolute space, or in something like a log and/or absolute value domain? 
> 
> Thanks,
> Kyle Gorman
> U of Pennsylvania