[Lign274] lign274-homework1-questions

Roger Levy rlevy at ling.ucsd.edu
Thu Jan 28 11:10:21 PST 2010


Hi Wen,

Good questions.  Here are some answers:

On Jan 28, 2010, at 1:05 AM, Wen-Hsuan Chan wrote:

> Hi Roger,
>
> I tried to use the software on the server (just following the MegaM
> How-To Guide).
> Here are some basic questions:
>
> 1. Should we run those procedures in our home directory?
> If I run it in /local/contrib/lign274/data/log_linear_phonotactics,
> it shows "Permission denied":
>
> wec017 at morel:/local/contrib/lign274/data/log_linear_phonotactics 
> $ ../../bin/megam.opt -lambda 0 -explicit -fvals -nobias multiclass  
> sample_data_file > weights
> -bash: weights: Permission denied
>
> However, running megam from my home directory does not work either
> (-bash: ../../bin/megam.opt: No such file or directory).

Right -- the problem is that you don't have permission to write into
the log_linear_phonotactics directory. The relative path
../../bin/megam.opt only resolves from inside that directory, which is
why it fails from your home directory. Call megam.opt by its absolute
path and redirect the output to somewhere in your home directory, e.g.:

   $ /local/contrib/lign274/bin/megam.opt -lambda 0 -explicit -fvals -nobias \
       multiclass \
       /local/contrib/lign274/data/log_linear_phonotactics/sample_data_file \
       > ~/weights

>
> 2.
> Another error occurs when calling MegaM with the -predict <weights> option:
>
> wec017 at morel:/local/contrib/lign274/data/log_linear_phonotactics 
> $ ../../bin/megam.opt -samefeat -explicit -fvals -predict  
> sample_data_file.weighted multiclass \ sample_data_file
> Fatal error: exception Failure("error: "float_of_string" occured on  
> line 1 when trying to parse "#" as a float")
>
> I'm not sure what it means??

Right -- sample_data_file.weighted is not a weights file; it is a
training-data file in which the observations are weighted (e.g., setting
a weight of 2 means to treat this observation as if it occurred
twice). That is why MegaM fails when it tries to read the "#" on line 1
as a float. After calling the command I mentioned above to train weights
and record them in your home directory, you could call

   $ /local/contrib/lign274/bin/megam.opt -explicit -fvals -predict ~/weights \
       multiclass \
       /local/contrib/lign274/data/log_linear_phonotactics/sample_data_file \
       > ~/predictions

The individual predictions would then be recorded in ~/predictions.

> 3.  For output of 2, like this:
> 0	0.73949975031254744362	0.24389340542259144162	0.01660684426486126741
> first column is predicted outcome class.
>
> I think I am still confused about the response class. In the context
> of phonotactic knowledge, my understanding is that our goal is to find
> the distribution over possible sound sequences. So what does the
> "response class" mean here? Also, what do the following 3 numbers
> stand for? Can we think of them as the predicted probabilities for
> each class?

The response class is the phonological sequence in question. You have  
to associate an integer with each sequence (basically just put the  
sequences in some order and then number them starting from zero).  The  
output of -predict is then just (1) the response class with maximum  
probability; and (2) the probabilities of all the response classes.
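
To make the numbering and the output format concrete, here is a minimal
sketch in Python. It assumes the -predict output is whitespace-separated
with the predicted class first, as in the line quoted above. The sequences
"st", "sp", "sk" and the function names are hypothetical, chosen only for
illustration; the real response classes come from the homework data.

```python
def index_classes(sequences):
    """Assign each distinct sequence an integer class ID, starting from zero."""
    ordered = sorted(set(sequences))  # any fixed order works; sorted is reproducible
    return {seq: i for i, seq in enumerate(ordered)}


def parse_prediction_line(line):
    """Split one line of -predict output into (predicted_class, probabilities)."""
    fields = line.split()
    return int(fields[0]), [float(p) for p in fields[1:]]


# Hypothetical response classes, for illustration only.
class_ids = index_classes(["st", "sp", "sk"])
id_to_seq = {i: seq for seq, i in class_ids.items()}

# The example output line quoted earlier in this thread.
line = "0\t0.73949975031254744362\t0.24389340542259144162\t0.01660684426486126741"
predicted, probs = parse_prediction_line(line)

# The first column should agree with the position of the largest probability.
best = max(range(len(probs)), key=probs.__getitem__)
```

Looking up id_to_seq[predicted] then turns the integer back into the
sequence it stands for, and probs[i] is the model's probability for the
sequence numbered i.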

>
> 4. "Define a small set of feature functions..." Does this mean
> converting word_suffix into a format like that of sample_data_file,
> where each phoneme is represented as "i # f_11 r_11..." according to
> the features we define? And then we can use this file with MegaM?

Yes, that's right.
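
As a rough sketch of that conversion in Python: this assumes each line of
the -explicit -fvals format is the observed class ID followed by one
"#"-introduced block of feature-name/value pairs per candidate class, which
is how I read "i # f_11 r_11...". The function name and the feature names
are hypothetical; sample_data_file and the MegaM How-To Guide are the
authority on the exact layout.

```python
def explicit_line(observed_class, features_per_class):
    """Build one training line: the observed class ID, then one
    '#'-introduced block of 'name value' pairs per candidate class."""
    parts = [str(observed_class)]
    for feats in features_per_class:
        parts.append("#")
        for name, value in feats:
            parts.extend([name, str(value)])
    return " ".join(parts)


# Hypothetical feature names and values, for illustration only:
# two candidate classes, each with its own feature vector.
example = explicit_line(
    1,
    [
        [("onset_s", 1), ("length", 2)],  # features if the response is class 0
        [("onset_s", 0), ("length", 3)],  # features if the response is class 1
    ],
)
```

Writing one such line per observation to a file in your home directory
gives a training file you can pass to megam.opt in place of
sample_data_file.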

Don't hesitate to ask more!

--

Roger Levy                      Email: rlevy at ling.ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy









