<OT> New Posting: ROA-884

Thu Dec 7 12:16:06 PST 2006

ROA 884-1206

Rich Lexicons and Restrictive Grammars - Maximum Likelihood Learning in Optimality Theory

Gaja Jarosz <jarosz at linguist.umass.edu>

Direct link: http://roa.rutgers.edu/view.php3?roa=884

Abstract:
This dissertation undertakes the full formal problem of
phonological learning - the learning of phonological lexicons
and restrictive phonological grammars that assign hidden
structure given only overt (unstructured) phonological forms
with associated morphemes (e.g. <[dagz], DOG+PLURAL>). A
major challenge is the learning of grammars that are
simultaneously restrictive and have generalizing capacity, two contradictory
requirements. The proposed solution, Maximum Likelihood
Learning of Lexicons and Grammars (MLG), combines a probabilistic
formulation of Optimality Theory (Prince and Smolensky 1993/2004)
with statistical learning via likelihood maximization.

Chapter 2 introduces the proposed theory of phonological
learning, whose central premise is that the correct grammar
and lexicon combination makes the overt forms most likely,
given richness of the base. The generalizing capacity of
grammars is attributed to formal linguistic theory, in particular
to implicational markedness universals. The identification
of restrictive grammars is the consequence of maximum likelihood
learning in conjunction with explicit reliance on richness
of the base.

Chapter 3 proposes EMGL, a possible algorithm for learning
within MLG. EMGL is a variant of the well-known
Expectation-Maximization algorithm. This procedure is demonstrated to successfully
learn general, restrictive grammars and correct lexicons
in a variety of artificial language systems with different
kinds of hidden structure: syllable structure, yer vowels,
voicing neutralization, and free variation.

While the ability of MLG to identify restrictive grammars
is due to reliance on likelihood maximization, its ability
to identify grammars with generalizing capacity depends
on the formal linguistic system it incorporates. Chapter
4 focuses on typological variation in the domain of syllable
structure, extending the formal linguistic system to account
for apparent implicational markedness inconsistencies in
this domain. The novel theory of syllable structure and
sonority restrictions, Headed Feature Domain Syllable Theory,
is applied to Polish, accounting for a variety of complex
sonority restrictions and building a foundation for the
modeling of the acquisition of syllable structure.

The predictions of MLG for the process of acquisition are
discussed in Chapter 5. The proposed learning theory builds
a foundation for the computational modeling of child phonological
acquisition, accounting for the end states of two stages
of acquisition, phonotactic learning and morphophonemic
learning, as well as the gradual transition between these
stages. Markedness is predicted to be the primary constraint
on possible acquisition paths, with frequency playing a
secondary role. The thesis presents stage-by-stage predictions
for the acquisition of onsets of varying sonority and complexity
in Polish based on observed frequencies of various onsets
in Polish.

Comments: 
Keywords: learnability, maximum likelihood, richness of the base, lexicon, Polish, syllable structure
Areas: Phonology, Computation, Learnability
Type: PhD Dissertation
