<OT> New Posting: ROA-858
roa at ruccs.rutgers.edu
Wed Aug 16 17:03:16 PDT 2006
ROA 858-0806
A Maximum Entropy Model of Phonotactics and Phonotactic Learning
Bruce Hayes <bhayes at humnet.ucla.edu>
Colin Wilson <colin at humnet.ucla.edu>
Direct link: http://roa.rutgers.edu/view.php3?roa=858
Abstract:
The study of phonotactics (e.g., the ability of English
speakers to distinguish possible words like 'blick' from
impossible words like *'bnick') is a central topic in phonology.
We propose a theory of phonotactic grammars and a learning
algorithm that constructs such grammars from positive evidence.
Our grammars consist of constraints that are assigned numerical
weights according to the principle of maximum entropy.
Possible words are assessed by these grammars based on the
weighted sum of their constraint violations. The learning
algorithm is robust against errors in the training data
and yields grammars that can capture both categorical and
gradient phonotactic patterns. The algorithm is not provided
with any constraints in advance, but uses its own resources
to form constraints and weight them. A baseline model, in
which Universal Grammar is reduced to a feature set and
an SPE-style constraint format, suffices to learn many
phonotactic phenomena. In order to learn nonlocal phenomena such as
stress and vowel harmony, it is necessary to augment the
model with autosegmental tiers and metrical grids. Our results
thus offer novel, learning-theoretic support for such
representations.
We apply the model to English syllable onsets, Shona vowel
harmony, quantity-insensitive stress typology, and the full
phonotactics of Wargamay, showing that the learned grammars
capture the distributional generalizations of these languages
and accurately predict experimental findings.
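
The scoring idea in the abstract (a weighted sum of constraint
violations, interpreted under maximum entropy) can be sketched roughly
as follows. This is only an illustration, not the authors' code: the
two constraints, their weights, and the three-word candidate set are
hypothetical, and normalizing over a small fixed list stands in for
normalizing over all possible words. It assumes the usual maximum
entropy form, in which a word's probability is proportional to
exp(-penalty).

```python
import math

# Hypothetical constraints: each maps a word to a violation count.
def no_initial_bn(word):
    return 1 if word.startswith("bn") else 0

def no_initial_bl(word):
    return 1 if word.startswith("bl") else 0

# Illustrative weights only; these are not values from the paper.
CONSTRAINTS = [(no_initial_bn, 5.0), (no_initial_bl, 0.5)]

def penalty(word, constraints=CONSTRAINTS):
    """Weighted sum of the word's constraint violations."""
    return sum(weight * violations(word) for violations, weight in constraints)

def maxent_probs(words, constraints=CONSTRAINTS):
    """Probability proportional to exp(-penalty), normalized over `words`."""
    scores = {w: math.exp(-penalty(w, constraints)) for w in words}
    z = sum(scores.values())
    return {w: s / z for w, s in scores.items()}

# Toy candidate set (an assumption made for this sketch).
probs = maxent_probs(["blick", "bnick", "brick"])
for word, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{word}: penalty={penalty(word):.1f}, p={p:.3f}")
```

With these made-up weights, 'bnick' gets a much higher penalty than
'blick' and hence a much lower probability, so the scores are gradient
rather than a strict possible/impossible split.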
Comments: Ms., Department of Linguistics, UCLA
Keywords: phonotactics, learning
Areas: Phonology, Learnability
Type: Manuscript