[Lign251] homework question

Roger Levy rlevy at ucsd.edu
Mon Oct 20 13:34:47 PDT 2008


Hi Ross,

Ross Metusalem wrote:
> Hi Roger,
> 
> I have a couple questions regarding #3 from the homework. First, we're
> supposed to plot histograms of the number of syllables over a) word
> types and b) tokens; "brown-counts-lengths-nsyll" contains the word
> counts with which to do the latter, but what information are we
> supposed to use to do the former? 

Very simple -- when modeling by type, each item in the lexicon counts as
one observation.

A useful hint, too -- you can create a version of the Nsyl vector with
one entry per token by doing

rep(dat$Nsyl, dat$Count)
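
As a rough sketch of how the two histograms differ, something like the
following should work. The toy data frame and its Word column are made up
for illustration; only the Nsyl and Count column names come from the hint
above, and the real file may need to be read in differently.

```r
# Hypothetical toy lexicon standing in for the homework data
# (assumed columns: Nsyl = syllables per word, Count = token frequency)
dat <- data.frame(Word  = c("the", "linguist", "banana", "cat"),
                  Nsyl  = c(1, 2, 3, 1),
                  Count = c(50, 2, 1, 7))

# By type: each lexicon entry is one observation
nsyl.by.type <- dat$Nsyl

# By token: each entry's syllable count is repeated Count times
nsyl.by.token <- rep(dat$Nsyl, dat$Count)

# Bins centered on the integers 1, 2, 3, ...
breaks <- seq(0.5, max(dat$Nsyl) + 0.5, by = 1)

# Side-by-side histograms over types and over tokens
par(mfrow = c(1, 2))
hist(nsyl.by.type,  breaks = breaks, main = "Types",
     xlab = "Number of syllables")
hist(nsyl.by.token, breaks = breaks, main = "Tokens",
     xlab = "Number of syllables")
```

The type vector has one entry per row of dat, while the token vector has
sum(dat$Count) entries, so frequent short words dominate the token
histogram.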


> Also, when we try to fit binomial or
> geometric distributions to the histograms, I get the impression that
> we are expected to simply look at distributions with different
> parameters and see which best fit the data, but I want to make sure
> that this is correct and that we aren't instead supposed to use a
> method of parameter estimation discussed in class.

Good question.  At this point, just eyeballing it is fine -- no need to
use a formal technique of parameter estimation.
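
One possible way to eyeball a fit is to overlay candidate probabilities on
the observed relative frequencies and adjust the parameters by hand. This
is only a sketch: the data below are invented, the parameter values are
arbitrary starting guesses, and shifting both distributions by one (so
that a word has at least one syllable) is an assumption, not something
the homework specifies.

```r
# Hypothetical syllable counts (invented for illustration)
nsyl <- rep(c(1, 2, 3, 4), times = c(60, 25, 10, 5))

# Observed relative frequency of each syllable count 1..max
k <- 1:max(nsyl)
observed <- as.numeric(table(factor(nsyl, levels = k))) / length(nsyl)

# Candidate distributions, shifted so the support starts at 1 syllable
# (assumption). Parameter values are guesses to be tuned by eye.
p.geom  <- 0.6
n.binom <- max(nsyl) - 1
p.binom <- 0.2
geom.probs  <- dgeom(k - 1, prob = p.geom)
binom.probs <- dbinom(k - 1, size = n.binom, prob = p.binom)

# Observed frequencies as vertical lines, candidates as points
plot(k, observed, type = "h", lwd = 3,
     ylim = c(0, max(observed, geom.probs, binom.probs)),
     xlab = "Number of syllables", ylab = "Probability")
points(k, geom.probs,  pch = 1)
points(k, binom.probs, pch = 2)
legend("topright", legend = c("observed", "geometric", "binomial"),
       lty = c(1, NA, NA), pch = c(NA, 1, 2))
```

Changing p.geom, n.binom, and p.binom and re-running the plot shows which
family and parameter setting tracks the observed frequencies most closely.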

Roger

-- 

Roger Levy                      Email: rlevy at ling.ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy


