<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">

<META content="MSHTML 6.00.2900.3268" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY bgColor=#ffffff>

<DIV><FONT size=2>Hey all,</FONT></DIV>

<DIV>&nbsp;</DIV>

<DIV><FONT size=2>While trying to finish my writeup for HW2 this morning, I 

decided to write some code to output a confusion matrix.&nbsp; Thought this 

might be useful to anybody who's still playing in this framework...</FONT></DIV>

<DIV>&nbsp;</DIV>

<DIV><FONT size=2>You can see a demo at <A 

href="http://www.cogsci.ucsd.edu/~bcipolli/WI08/LIGN%20256/hw2/conf_matrix.html">http://www.cogsci.ucsd.edu/~bcipolli/WI08/LIGN%20256/hw2/conf_matrix.html</A></FONT></DIV>

<DIV><FONT size=2></FONT>&nbsp;</DIV>

<DIV><FONT size=2>Roger, I'll be sending you this as part of my overall source changes (with a bit of documentation) sometime around the end of the quarter.</FONT></DIV>

<DIV><FONT size=2></FONT>&nbsp;</DIV>

<DIV><FONT size=2>=====================</FONT></DIV>

<DIV>&nbsp;</DIV>

<DIV><FONT size=2>I've attached the Java file; drop it into src/edu/berkeley/nlp/util.&nbsp; Here's a code snippet showing how I used it in POSTaggerTester:</FONT></DIV>

<DIV><FONT size=2></FONT>&nbsp;</DIV><FONT size=2>

<DIV>In POSTaggerTester.evaluateTagger (lines marked &gt;&gt; are the ones I added):</DIV>

<DIV><BR>&gt;&gt;&nbsp;&nbsp;&nbsp; ConfusionMatrix&lt;String, String&gt; 

confusionMatrix = new ConfusionMatrix&lt;String,String&gt;();</DIV>

<DIV>&nbsp;</DIV>

<DIV>&nbsp; Then later, at line 563 (or anywhere inside the loop over positions in the sentence):</DIV>

<DIV>&nbsp;&nbsp;<BR>&nbsp;&nbsp;&nbsp; if (guessedTag.equals(goldTag))<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; numTagsCorrect += 1.0;<BR>&gt;&gt;&nbsp; confusionMatrix.incrementCount(guessedTag, goldTag);<BR>&nbsp; <BR>&nbsp; Note that the inserted statement goes after the "if" statement, not inside it, so every (guessed, gold) pair is counted, not just the correct ones.<BR>&nbsp; <BR>&nbsp; Then, at the bottom of the function:<BR>&gt;&gt;&nbsp; if (verbose)<BR>&gt;&gt;&nbsp;&nbsp;&nbsp; System.out.println("Confusion Matrix: \n" + confusionMatrix.toString(toChange));<BR>&nbsp; <BR>&nbsp; <BR>Here, toChange is a map that "groups" tags according to "category" (which I defined).&nbsp; Before my println statement, I added:</DIV>

<DIV>&nbsp;</DIV>

<DIV>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; HashMap&lt;String,List&lt;String&gt;&gt; 

toChange = new HashMap&lt;String, 

List&lt;String&gt;&gt;();<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; List&lt;String&gt; 

toOther&nbsp; = Arrays.asList("-LRB-", "-RRB-", "EX", "FW", "LS", "MD", "PDT", 

"WP", "WP$", "UH", "WRB");</DIV>

<DIV>&nbsp;</DIV>

<DIV>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; toChange.put("[ADJ]", 

PennTreebankReader.adjTags);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 

toChange.put("[ADV]", 

PennTreebankReader.advTags);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 

toChange.put("[NOUN]", 

PennTreebankReader.nounTags);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 

toChange.put("[PUNCT]", 

PennTreebankReader.punctTags);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 

toChange.put("[VERB]", 

PennTreebankReader.verbTags);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 

toChange.put("[OTHER]", toOther);<BR>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <BR>&nbsp; 

Where I modified edu.berkeley.nlp.io.PennTreebankReader to have the static 

members:<BR>&nbsp;&nbsp;&nbsp; public static List&lt;String&gt; adjTags = 

Arrays.asList("JJ", "JJR", "JJS");<BR>&nbsp;&nbsp;&nbsp; public static 

List&lt;String&gt; advTags = Arrays.asList("RB", "RBR", 

"RBS");<BR>&nbsp;&nbsp;&nbsp; public static List&lt;String&gt; nounTags = 

Arrays.asList("NN", "NNP", "NNPS", "NNS");<BR>&nbsp;&nbsp;&nbsp; public static 

List&lt;String&gt; verbTags = Arrays.asList("VB", "VBD", "VBG", "VBN", "VBP", 

"VBZ");<BR>&nbsp;&nbsp;&nbsp; public static List&lt;String&gt; punctTags = 

Arrays.asList("#", "$", "\"", ",", ".", ":", "''", "``");</DIV>

<DIV>&nbsp;</DIV>

<DIV>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <BR>This simply defines grouping / rewrite rules within the confusion matrix: each tag in a list is reported under its category label.&nbsp; There's also a way to suppress printing of particular rows/columns in the confusion matrix.</DIV>
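<DIV>&nbsp;</DIV>
<DIV>To make the grouping idea concrete, here's a rough, self-contained sketch of how a confusion matrix with rewrite rules could work.&nbsp; This is not the attached file: the class name ConfusionMatrixSketch, the nested-map storage, and the getCount helper are assumptions made up for illustration.&nbsp; Only the incrementCount(guessedTag, goldTag) and toString(toChange) calls mirror the snippet above, and the real signatures may differ.</DIV>
<DIV>&nbsp;</DIV>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch; NOT the attached edu.berkeley.nlp.util.ConfusionMatrix. */
public class ConfusionMatrixSketch {
    // counts.get(guessed).get(gold) = number of times that pair was seen
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();

    /** Record one (guessed, gold) pair, whether or not the two agree. */
    public void incrementCount(String guessed, String gold) {
        counts.computeIfAbsent(guessed, k -> new TreeMap<>()).merge(gold, 1, Integer::sum);
    }

    /** Apply the rewrite rules (category -> member tags) and merge the affected cells. */
    private Map<String, Map<String, Integer>> grouped(Map<String, List<String>> groups) {
        // Invert category -> tags into tag -> category for direct lookup.
        Map<String, String> rewrite = new HashMap<>();
        for (Map.Entry<String, List<String>> e : groups.entrySet())
            for (String tag : e.getValue())
                rewrite.put(tag, e.getKey());

        Map<String, Map<String, Integer>> out = new TreeMap<>();
        for (Map.Entry<String, Map<String, Integer>> row : counts.entrySet()) {
            // Tags without a rule keep their own name.
            String guessed = rewrite.getOrDefault(row.getKey(), row.getKey());
            for (Map.Entry<String, Integer> cell : row.getValue().entrySet()) {
                String gold = rewrite.getOrDefault(cell.getKey(), cell.getKey());
                out.computeIfAbsent(guessed, k -> new TreeMap<>())
                   .merge(gold, cell.getValue(), Integer::sum);
            }
        }
        return out;
    }

    /** Count for one cell after grouping (helper added for this sketch). */
    public int getCount(String guessed, String gold, Map<String, List<String>> groups) {
        Map<String, Integer> row = grouped(groups).get(guessed);
        return row == null ? 0 : row.getOrDefault(gold, 0);
    }

    /** Rows are guessed labels, columns are gold labels. */
    public String toString(Map<String, List<String>> groups) {
        Map<String, Map<String, Integer>> table = grouped(groups);
        TreeSet<String> labels = new TreeSet<>(table.keySet());
        for (Map<String, Integer> row : table.values())
            labels.addAll(row.keySet());

        StringBuilder sb = new StringBuilder(String.format("%-8s", ""));
        for (String gold : labels)
            sb.append(String.format("%8s", gold));
        sb.append('\n');
        for (String guessed : labels) {
            sb.append(String.format("%-8s", guessed));
            Map<String, Integer> row = table.get(guessed);
            for (String gold : labels) {
                int n = (row == null) ? 0 : row.getOrDefault(gold, 0);
                sb.append(String.format("%8d", n));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ConfusionMatrixSketch cm = new ConfusionMatrixSketch();
        cm.incrementCount("NN", "NN");   // correct
        cm.incrementCount("NNS", "NN");  // wrong tag, same category
        cm.incrementCount("JJ", "NN");   // wrong tag, different category
        cm.incrementCount("VB", "VBD");

        Map<String, List<String>> toChange = new HashMap<>();
        toChange.put("[NOUN]", Arrays.asList("NN", "NNP", "NNPS", "NNS"));
        toChange.put("[VERB]", Arrays.asList("VB", "VBD", "VBG", "VBN", "VBP", "VBZ"));
        System.out.println("Confusion Matrix: \n" + cm.toString(toChange));
    }
}
```

<DIV>&nbsp;</DIV>
<DIV>In this sketch the grouping happens only at print time, so the raw per-tag counts stay intact and the same matrix can be printed with different rule sets.&nbsp; With the rules in main, the NN and NNS rows collapse into a single [NOUN] row, while JJ keeps its own row because no rule mentions it.</DIV>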

<DIV>&nbsp;</DIV>

<DIV>OK, not sure if that's useful, easy, hard, etc.&nbsp; Please feel free to email me with any questions.</DIV>

<DIV>&nbsp;</DIV>

<DIV>Ben</FONT></DIV></BODY></HTML>