<OT> New Posting: ROA-1013

roa at ruccs.rutgers.edu roa at ruccs.rutgers.edu
Sun Feb 15 21:21:37 PST 2009


ROA 1013-0209

Learning Phonological Grammars for Output-Driven Maps

Bruce Tesar <tesar at rutgers.edu>

Direct link: http://roa.rutgers.edu/view.php3?roa=1013


Abstract:
The challenge of simultaneously learning a lexicon of underlying
forms and a constraint ranking has been addressed by several
scholars in recent work (Apoussidou 2007, Jarosz 2006, Merchant
2008, Tesar 2006). In particular, the proposal of Merchant,
the Contrast Pair and Ranking information algorithm (CPR),
avoids having to explicitly enumerate all possible underlying
forms for each morpheme (in contrast to Apoussidou and Jarosz),
and also avoids having to explicitly enumerate all possible
constraint rankings (in contrast to Jarosz).


While CPR avoids those computational traps, there are still
some components of CPR (and of the related work by Tesar)
that pose computational difficulties. (1) The focus of CPR
on lexical hypotheses for only a pair of related words at
a time (a contrast pair) is a vast improvement over simultaneous
consideration of all possible lexica, but the space of lexical
hypotheses for a single contrast pair still grows exponentially
in the number of unset underlying features for the morphemes
involved in the pair. (2) The technique of initial lexicon
construction, setting in advance features that do not alternate,
can restrict further the number of lexical hypotheses that
need to be considered, but at the cost of requiring that
the learner have a complete paradigm of surface form data
before learning of underlying forms can begin. (3) The extraction
of ranking information performed by CPR is able to obtain
ranking information from contrast pairs for which complete
underlying forms have not yet been determined, but also
faces exponential computational complexity, due in part
to the fact that the procedure is separately computing the
ranking implications of each lexical hypothesis in the (exponenti
ally growing) set of possible hypotheses for the contrast pair.


The current paper demonstrates that each of these computational
concerns can be significantly improved upon by taking the
structure of grammars into greater consideration. The key
grammatical structure lies in Tesar's proposal of output-driven
map (Tesar 2008). Intuitively, an output-driven map is a
phonological map in which all disparities introduced between
the input and the output are motivated by conditions on
the output. This notion is formalized by the requirement
that any grammatical input-output mapping A->C entails the
grammaticality of B->C whenever B has 'greater similarity'
to C than A does (A->C has every input-output disparity
that B->C does, but B->C may lack some disparities of A->C).
An output-driven map is necessarily a restricted identity
map (Prince and Tesar 2004), meaning that every grammatical
form maps to itself, a property assumed to hold of grammars
in much learnability work, including that of Merchant. Output-dri
ven maps can be viewed as a strengthened version of restricted
identity maps.


The structure of output-driven maps can be exploited in
learning via the contrapositive: B~->C entails A~->C. Given
a grammatical output C, it is a given that C->C (restricted
identity map property). Suppose B has one disparity with
C (e.g., they differ in the value of one feature on one
segment). If the learner possesses sufficient information
to determine that B cannot map to C, then the learner need
not bother checking to see if A maps to C; because the map
is output-driven, any input which has, relative to C, all
of the disparities of B plus additional ones cannot be grammatica
l. All lexical hypotheses which include all of the disparities
of B->C may be dismissed without evaluation. Instead of
needing to evaluate all combinations of possible values
for all unset features of a word (exponential in the number
of unset features), the learner can obtain the same information
while only evaluating a single unset feature at a time (linear
in the number of unset features), having the other unset
features match (temporarily) the values of their output
correspondents, addressing concern (1). If a word has eight
unset binary features, this means evaluating 8 lexical hypotheses
instead of 256. Even greater benefit is realized when obtaining
ranking information from forms with unset features, addressing
concern (3).


The speed-up realized by exploiting the structure of output-drive
n maps is significant enough that initial lexicon construction
is no longer needed. This frees the learner from needing
an entire paradigm before learning commences; the learner
can begin learning about underlying forms from even a single
datum, addressing concern (2). This algorithm has the notable
property that features of underlying forms which cannot
be shown to require a particular value remain unset; non-contrast
ive features are never set, without any need for the learner
to separately construct an 'inventory of contrastive features'.


References


Apoussidou, Diana. 2007. The Learnability of Metrical Phonology.
PhD. dissertation, University of Amsterdam, Amsterdam.


Jarosz, Gaja. 2006. Rich Lexicons and Restrictive Grammars
- Maximum Likelihood Learning in Optimality Theory. PhD.
dissertation, The Johns Hopkins University, Baltimore, MD.
ROA-884.


Merchant, Nazarré. 2008. Discovering underlying forms:
Contrast pairs and ranking. PhD. dissertation, Rutgers University
, New Brunswick. ROA-964.


Prince, Alan, and Tesar, Bruce. 2004. Learning phonotactic
distributions. In Constraints in Phonological Acquisition,
ed. by René Kager, Joe Pater, and Wim Zonneveld, 245-291.
Cambridge: Cambridge University Press.


Tesar, Bruce. 2006. Learning from paradigmatic information.
In Proceedings of the 36th Meeting of the North East Linguistics
Society, ed. by Christopher Davis, Amy Rose Deal, and Youri
Zabbal, 619-638. GLSA. ROA-795.


Tesar, Bruce. 2008. Output-driven maps. Ms., Linguistics
Dept., Rutgers University. ROA-956.

Comments: To appear in NELS 39.
Keywords: stress, underlying forms
Areas: Phonology,Learnability,Computation
Type: Conference Proceedings Chapter

Direct link: http://roa.rutgers.edu/view.php3?roa=1013



More information about the Optimal mailing list