[Lign274] lign274-homework1-questions
Roger Levy
rlevy at ling.ucsd.edu
Thu Jan 28 11:10:21 PST 2010
Hi Wen,
Good questions. Here are some answers:
On Jan 28, 2010, at 1:05 AM, Wen-Hsuan Chan wrote:
> Hi Roger,
>
> I tried to use the software on the server (just following the MegaM
> How-To Guide).
> Here are some basic questions:
>
> 1. Should we run those procedures in our home directory?
> If I run it in /local/contrib/lign274/data/log_linear_phonotactics,
> it shows "Permission denied":
>
> wec017 at morel:/local/contrib/lign274/data/log_linear_phonotactics
> $ ../../bin/megam.opt -lambda 0 -explicit -fvals -nobias multiclass sample_data_file > weights
> -bash: weights: Permission denied
>
> However, running megam from my home directory does not work either
> (-bash: ../../bin/megam.opt: No such file or directory).
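Right -- the problem is that you don't have permission to write into
the log_linear_phonotactics directory. From your home directory, the
relative path ../../bin/megam.opt no longer points at the binary, so
use absolute paths and redirect the output to somewhere in your home
directory, e.g.: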
$ /local/contrib/lign274/bin/megam.opt -lambda 0 -explicit -fvals -nobias multiclass \
    /local/contrib/lign274/data/log_linear_phonotactics/sample_data_file > ~/weights
>
> 2.
> Another error occurs when calling MegaM with the -predict <weight> option:
>
> wec017 at morel:/local/contrib/lign274/data/log_linear_phonotactics
> $ ../../bin/megam.opt -samefeat -explicit -fvals -predict
> sample_data_file.weighted multiclass \ sample_data_file
> Fatal error: exception Failure("error: "float_of_string" occured on
> line 1 when trying to parse "#" as a float")
>
> I'm not sure what it means??
Right -- sample_data_file.weighted is not a weights file; it is a
training-data file in which the observations are weighted (e.g., setting
a weight of 2 means to treat this observation as if it occurred
twice). That is why MegaM fails when it tries to read a "#" from it as
a float. After calling the command I mentioned above to train weights
and record them in your home directory, you could call
$ /local/contrib/lign274/bin/megam.opt -explicit -fvals -predict ~/weights multiclass \
    /local/contrib/lign274/data/log_linear_phonotactics/sample_data_file > ~/predictions
The individual predictions would then be recorded in ~/predictions.
> 3. The output of step 2 looks like this:
> 0 0.73949975031254744362 0.24389340542259144162 0.01660684426486126741
> The first column is the predicted outcome class.
>
> I am still confused about the response class. In the context of
> phonotactic knowledge, my understanding is that our goal is to find
> the distribution over possible sound sequences. So what does the
> "response class" mean here? Also, what do the following three
> numbers stand for? Can we think of them as the predicted
> probabilities for each class?
The response class is the phonological sequence in question. You have
to associate an integer with each sequence (basically just put the
sequences in some order and then number them starting from zero). The
output of -predict is then just (1) the response class with maximum
probability; and (2) the probabilities of all the response classes.
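To make the numbering concrete, here is a minimal Python sketch of how one
line of the -predict output could be read back into sequences. The function
name parse_prediction_line and the labels in the sequences list are
placeholders I made up for illustration; the only thing taken from your
output is the sample line itself, and I am assuming the fields are separated
by whitespace.

```python
def parse_prediction_line(line):
    """Split one line of -predict output into (predicted_class, probabilities)."""
    fields = line.split()
    predicted = int(fields[0])              # first column: most probable class
    probs = [float(x) for x in fields[1:]]  # remaining columns: one per class
    return predicted, probs


# Hypothetical ordering of the sequences; class i is simply sequences[i].
sequences = ["sequence_0", "sequence_1", "sequence_2"]

# The sample output line quoted in question 3.
line = "0 0.73949975031254744362 0.24389340542259144162 0.01660684426486126741"
predicted, probs = parse_prediction_line(line)

# The class with the largest probability should match the first column.
best = max(range(len(probs)), key=probs.__getitem__)

for name, p in zip(sequences, probs):
    print(f"{name}\t{p:.4f}")
print("predicted:", sequences[predicted])
```

The position of each probability in the line is the integer you assigned to
the corresponding sequence, so whatever order you choose needs to stay fixed
between building the data file and reading the predictions.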
>
> 4. "Define a small set of feature functions..." Does this mean
> converting word_suffix into a format like that of sample_data_file?
> Each phoneme is represented as "i # f_11 r_11..." according to the
> features we define, and then we can use this file with MegaM.
Yes, that's right.
Don't hesitate to ask more!
--
Roger Levy Email: rlevy at ling.ucsd.edu
Assistant Professor Phone: 858-534-7219
Department of Linguistics Fax: 858-534-4789
UC San Diego Web: http://ling.ucsd.edu/~rlevy