[R-lang] Re: plotting residuals

bogartz@psych.umass.edu
Tue Apr 13 11:58:50 PDT 2010


The desire "to show there's no interesting trend in the residuals" is
understandable, but using the residuals to find out where the model
goes wrong is another helpful approach.  An emphasis on goodness of
fit tends to detract from exploring the important information in
badness of fit.
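
As a rough sketch of what I mean, something like the following could be
a starting point. The data here are simulated purely for illustration
(they are not Kyle's data), and the names rnk, freq, fit, res and worst
are my own placeholders. It fits the log-log regression, plots the
residuals against log rank with a lowess smoother, and lists the points
the model misses most badly.

```r
# Simulated rank-frequency data: an assumption for illustration only.
set.seed(1)
rnk  <- 1:200
freq <- 1000 * rnk^(-1.1) * exp(rnorm(200, sd = 0.1))

# Linear model in log-log space, as described in the original question.
fit <- lm(log(freq) ~ log(rnk))
res <- residuals(fit)

# Residuals against log rank, with a zero line and a lowess smoother
# to make any systematic trend easier to see.
plot(log(rnk), res, xlab = "log rank", ylab = "residual")
abline(h = 0, lty = 2)
lines(lowess(log(rnk), res), col = "red")

# The five observations with the largest absolute residuals,
# i.e. where the model fits worst.
worst <- order(abs(res), decreasing = TRUE)[1:5]
print(data.frame(rank = rnk[worst], residual = res[worst]))
```

If the smoother bends away from zero only at the extremes, that would be
consistent with misfit confined to the endpoints; where it bends
elsewhere, the table of worst-fitting points is a place to start looking.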

Quoting Steven Piantadosi <piantado@mit.edu>:

> Hi Kyle,
>
> plot(..) called on an lm object will call plot.lm, which you can read
> about with "?plot.lm"--the help screen describes the plots and has some
> references for regression diagnostics.
>
> If you would like to extract the residuals from the regression and do
> something else with them, you can call the function residuals( ... ) on
> an lm object.
>
> If you are interested in checking for Zipfian distributions, there are
> some people who have derived tests for it (e.g., Urzua 2000: "A simple and
> efficient test for Zipf’s law" has one, and cites a few others). Urzua
> argues against doing a linear regression on log rank vs. log frequency
> because the residuals are not normal--he suggests you should really work
> with tests and diagnostics that respect the distribution implied by
> Zipf's law.
>
> Hope that helps!
>
> ++Steve
>
>>
>> I have a nice instance of Zipfian (well, power law) dynamics that  
>> I've plotted (log-rank vs. log-frequency). I also fit a linear  
>> model to the data in log-space and want to show there's no  
>> interesting trend in the residuals except for at the endpoints.  
>> Simply plotting the residuals seems a bit simple, but I don't quite  
>> understand what the different measures are that are shown by  
>> calling plot() on the lm() instance. Does anyone have a particular  
>> idea about which of the four different plots is most relevant and  
>> easy to interpret/explain, or could point me to a useful resource?  
>> Also, if I end up just plotting residuals and fitting some kind of  
>> smoother to them, should I do it in the absolute space, or in  
>> something like a log and/or absolute value domain?
>>
>> Thanks,
>> Kyle Gorman
>> U of Pennsylvania
>



Richard S. Bogartz
Professor of Psychology
UMASS, Amherst 01003


