<OT> New Posting: ROA-896
Rutgers Optimality Archive
roa at ruccs.rutgers.edu
Sun Feb 18 14:01:57 PST 2007
ROA 896-0107
Finding the Right Words: Implementing Optimality Theory with Simulated Annealing
Tamás Bíró <birot at nytud.hu>
Direct link: http://roa.rutgers.edu/view.php3?roa=896
Abstract:
This dissertation presents an implementation of Optimality Theory (OT)
that also aims at accounting for certain variations in speech. The
Simulated Annealing for Optimality Theory Algorithm (SA-OT, Fig. 2.8,
on page 64) combines OT with simulated annealing, a widespread
heuristic optimisation technique. After a general introduction to
Optimality Theory and the discussion of certain 'philosophical
background questions' (especially on the role of probabilities in
linguistics; Chapter 1), the SA-OT Algorithm is introduced (informally
in section 2.2, mathematically in sections 3.3 and 3.4), put into a
broader context (section 2.1, Chapter 4, and sections 8.2 and 8.3),
and experimented with (section 2.3, Chapters 5-7). The mathematical
underpinning of SA-OT is preceded by a formal analysis of OT,
including the representation of violation profiles using polynomials
and ordinal numbers.
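
To make the idea of a violation profile concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the dissertation's own formalism: profiles are assumed to be tuples of violation counts ordered from the highest-ranked constraint to the lowest, and the names `more_harmonic` and `polynomial_value` are hypothetical. It shows that strict domination amounts to lexicographic comparison, and that a polynomial encoding preserves this order only if the base exceeds every violation count, which is one reason to consider ordinal numbers when counts are unbounded.

```python
from typing import Sequence


def more_harmonic(a: Sequence[int], b: Sequence[int]) -> bool:
    """Return True if profile a is strictly more harmonic than profile b.

    Profiles list violation counts from the highest-ranked constraint
    to the lowest, so strict domination is lexicographic comparison.
    """
    return tuple(a) < tuple(b)


def polynomial_value(profile: Sequence[int], q: int) -> int:
    """Evaluate the profile as a polynomial in q (Horner's scheme).

    The highest-ranked constraint receives the highest power of q.
    """
    value = 0
    for violations in profile:
        value = value * q + violations
    return value


# Hypothetical profiles over three ranked constraints.
a = (0, 3, 1)  # satisfies the top constraint, violates lower ones
b = (1, 0, 0)  # violates the top constraint once

# With a base larger than any violation count, the polynomial order
# agrees with the lexicographic order; with too small a base it fails.
agrees_with_base_4 = polynomial_value(a, 4) < polynomial_value(b, 4)
agrees_with_base_2 = polynomial_value(a, 2) < polynomial_value(b, 2)
```

Here `a` is more harmonic than `b` because the highest-ranked constraint decides, however many lower-ranked violations `a` incurs.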
As section 2.1 argues, heuristic algorithms, such as SA-OT, may
serve as adequate models of the computations performed by the human
brain for at least three reasons: (1) many of these algorithms are
simple, (relatively) efficient and produce some output within a
predefined time span, even if (2) they may make errors, and finally
(3) the algorithm can be sped up at the price of reduced precision: a
faster computation is possible, but it is more prone to errors. The
adequacy of such a model is corroborated if, besides the grammatical
forms, it also reproduces the empirically observable error
patterns under different conditions. Importantly, these predictions
are quantitative, and the algorithm's parameters can 'fine-tune' the
output frequencies of the erroneous or alternating forms.
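
The trade-off between speed and precision can be sketched with generic simulated annealing. The Python code below is not SA-OT itself, whose handling of temperature over ranked constraints is defined in the dissertation. It assumes a scalar cost, a hypothetical one-dimensional landscape `COST` in which each candidate's neighbours are the adjacent indices, and a linear cooling schedule; `anneal` and `success_rate` are names introduced only for this illustration.

```python
import math
import random

# Hypothetical landscape: index 1 is a local optimum (cost 1),
# index 5 is the global optimum (cost 0).
COST = [3, 1, 2, 4, 2, 0, 3]


def anneal(cost, start, t_max, t_step, rng):
    """Run one random walk with linearly decreasing temperature."""
    current = start
    t = t_max
    while t > 0:
        neighbour = current + rng.choice((-1, 1))
        if 0 <= neighbour < len(cost):
            delta = cost[neighbour] - cost[current]
            # Always accept improvements; accept a worse neighbour
            # with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = neighbour
        t -= t_step  # a larger step means a faster, coarser run
    return current


def success_rate(t_step, trials, seed=0):
    """Fraction of runs that end in the global optimum."""
    rng = random.Random(seed)
    best = min(range(len(COST)), key=COST.__getitem__)
    hits = sum(
        anneal(COST, 0, 3.0, t_step, rng) == best for _ in range(trials)
    )
    return hits / trials


slow = success_rate(t_step=0.01, trials=500)  # about 300 steps per run
fast = success_rate(t_step=0.5, trials=500)   # about 6 steps per run
```

Comparing `slow` and `fast` gives output frequencies for the two schedules; one would typically expect the slower schedule to reach the global optimum more often, and the step size is the kind of parameter that tunes those frequencies.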
Therefore, SA-OT is claimed to be a model of linguistic performance
(Table 2.1, page 43). By distinguishing between a linguistic model and
its implementation, one can account for both linguistic competence and
certain types of linguistically motivated performance phenomena. Thus
an adequate linguistic model (a grammar, such as a well-founded OT
grammar) predicts correctly which forms are judged as grammatical by
the native speaker. This layer refers to the static knowledge of the
language in the native speaker's brain. On top of it, the
implementation of the grammar serves as a model of the dynamic
language production process.
In particular, SA-OT requires a topology (a neighbourhood structure)
on the OT candidate set. Consequently, the notion of a local optimum
is introduced: a candidate that is more harmonic than all its
neighbours is a local optimum, independently of whether it is the most
harmonic element of the entire candidate set. Local optima are the
candidates that can emerge as outputs in SA-OT. The global optimum
predicts the grammatical form, whereas all other outputs should model
performance errors.
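
The definition of a local optimum can be sketched in Python as follows. The candidate names, violation profiles and chain-shaped neighbourhood structure are hypothetical toy data, and `local_optima` is a name introduced for this illustration; profiles are assumed to be tuples ordered from the highest-ranked constraint down, so that a smaller tuple is more harmonic.

```python
def local_optima(profiles, neighbours):
    """Return the candidates more harmonic than all their neighbours."""
    return [
        cand
        for cand in profiles
        if all(profiles[cand] < profiles[n] for n in neighbours[cand])
    ]


# Hypothetical candidate set with violation profiles.
profiles = {
    "a": (0, 2),
    "b": (1, 0),
    "c": (0, 1),
    "d": (0, 3),
}

# Hypothetical topology: the candidates form a chain a - b - c - d.
neighbours = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}

optima = local_optima(profiles, neighbours)
global_optimum = min(profiles, key=profiles.get)
```

In this toy example both "a" and "c" are local optima, but only "c" is the global optimum. On the reading given above, "c" would correspond to the grammatical form and "a" to a possible performance error.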
The second part of the dissertation experiments with SA-OT, introduces
a few techniques and tricks, and analyses the role of its parameters.
For that purpose, the following phonological phenomena are modelled:
metrical stress shifts in Dutch fast speech (cf. Schreuder, ROA-846),
regressive and progressive voice assimilation, cliticization of the
Hungarian definite article and syllabification (Prince and Smolensky's
basic CV theory).
Interesting side results include arguments for including loser (never
winning) candidates in the candidate set and new types of ranking
arguments, both based on occurrence frequencies. After a comparison to
existing OT varieties and non-linguistic cognitive models, the
dissertation concludes that future research should decide whether
SA-OT or its competitors, the already existing stochastic OT models,
are more fruitful. However, I believe that they may complement each
other.
Comments: The three files contain the dissertation, the English
summary and the 'stellingen' (the propositions accompanying a Dutch
doctoral thesis). Electronic version:
http://dissertations.ub.rug.nl/faculties/arts/2006/t.s.biro/
Keywords: simulated annealing; variation; frequency; linguistic
performance; fast speech; voice assimilation; syllabification;
infinite candidate set; loser candidates; ranking arguments;
polynomials; ordinal numbers
Areas: Phonology, Computation, Formal Analysis
Type: PhD Dissertation