<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">

<META content="MSHTML 6.00.2900.3268" name=GENERATOR>

<STYLE></STYLE>

</HEAD>

<BODY bgColor=#ffffff>

<DIV><FONT size=2>OK, I finally see a bit better what's going on here.&nbsp; 

</FONT><FONT size=2>I can see why there's 48*48 source states to consider at 

each point.&nbsp; </FONT></DIV>

<DIV><FONT size=2></FONT>&nbsp;</DIV>

<DIV><FONT size=2>I can see that the "position" property of the State simply 

encodes where we are in the trellis.&nbsp; This is necessary because the 

transitions between states depends on the current word, and that is basically 

kept track of in the State by the position.&nbsp; If you remove the position, 

then States with the same two previous tags (but different positions) get 

mixed.&nbsp; In a bad case, this can lead to a trellis with a loop, and that 

winds up putting GreedyDecoder into an infinite loop.</FONT></DIV>

<DIV><FONT size=2></FONT>&nbsp;</DIV>

<DIV><FONT size=2>In my local copy, I have renamed "State" to "TrellisState", as 

a State is completely tied to building and decoding the trellis.&nbsp; I didn't 

really understand&nbsp;that; I thought State was a more general thing.&nbsp; I'm 

also confused about the Interner, as I still believe it is increasing memory 

about 25x normal and I can't imagine it's saving much time at all.&nbsp; I 

believe it's a hash table of states that gets way more new entries than it ever 

gets hits, due to the fact that States are unique on prevprevtag, prevtag, and 

position combinations.&nbsp; On the other hand, although I don't see why at all, 

the "canonicalization" it does looks to be necessary.&nbsp; Why? I don't 

know.</FONT></DIV>

<DIV><FONT size=2></FONT>&nbsp;</DIV>

<DIV><FONT size=2>In fact, after reverting all the Interner code I changed, I 

was able to get the original code to work again :) Now back to the HMM... I 

swear it should work! :P</FONT></DIV>

<DIV><FONT size=2></FONT>&nbsp;</DIV>

<DIV><FONT size=2>Ben</FONT></DIV>

<BLOCKQUOTE 

style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">

  <DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>

  <DIV 

  style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B> 

  <A title=bcipolli@cogsci.ucsd.edu href="mailto:bcipolli@cogsci.ucsd.edu">Ben 

  Cipollini</A> </DIV>

  <DIV style="FONT: 10pt arial"><B>To:</B> <A title=ligncse256@ling.ucsd.edu 

  href="mailto:ligncse256@ling.ucsd.edu">ligncse256@ling.ucsd.edu</A> </DIV>

  <DIV style="FONT: 10pt arial"><B>Sent:</B> Thursday, February 21, 2008 8:38 

  PM</DIV>

  <DIV style="FONT: 10pt arial"><B>Subject:</B> [Ligncse256] HW2: buildTrellis 

  method</DIV>

  <DIV><BR></DIV>

  <DIV><FONT size=2>Hey all,</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>I'm trying to debug some issues with my HMM scorer.&nbsp; In 

  doing so, the buildTrellis method of the POSTagger doesn't make sense to 

  me.&nbsp; I was wondering if anybody's taken a look.</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>Here's some basics:</FONT></DIV>

  <DIV><FONT size=2>* I'm using a subset of the training data for debugging 

  purposes.</FONT></DIV>

  <DIV><FONT size=2>* There are 48 unique tags in the training dataset that I'm 

  using</FONT></DIV>

  <DIV><FONT size=2>* There are 1201 unique combinations of tags</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>What I expect to do in building a trellis is:</FONT></DIV>

  <DIV><FONT size=2>* T0: start at the "START" state, and </FONT><FONT 

  size=2>calculate the probability of transitioning to each of the 48 unique 

  tags</FONT></DIV>

  <DIV><FONT size=2>* T1: for each of the 48 tags, determine the probability of 

  transitioning to each of the 48 unique tags.</FONT></DIV>

  <DIV><FONT size=2>* T2: ...</FONT></DIV>

  <DIV><FONT size=2>*T{end}: for each of the 48 tags, determine the probability 

  of transitioning to the end state.</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>However, this is what I see buildTrellis doing:</FONT></DIV>

  <DIV><FONT size=2>* T0: same</FONT></DIV>

  <DIV><FONT size=2>* T1: same</FONT></DIV>

  <DIV><FONT size=2>* T2: for each of the 48*48 possibilities of previous 

  transitions COMBINATIONS, calculate the transition to each of the 48 

  states.</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>I think this is happening because states are unique based 

  not on the previous state, but on the previous state and previous-previous 

  state.&nbsp; </FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>I was expecting, however, that this shouldn't be--the 

  probability of transition depends on the previous-previous and previous tag, 

  but does the trellis need an extra edges or something?&nbsp; I figured the 

  trellis would have the same exact architecture as the HMM trellis we discussed 

  in class, and thus always compute transitions from 48 states to 48 other 

  states (48^2 such computations).&nbsp; </FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>In addition, the State class encodes sentence 

  position.&nbsp; For the HMM model, this is never used.&nbsp; This winds up 

  completely exploding caching of states to try and save memory, leading to 

  maybe 20x extra memory used on State objects that is completely unnecessary in 

  the HMM case, if I understand properly.</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>Has anybody sifted through this code or can provide feedback 

  about my expectations?&nbsp; I'm not sure if I'm conceptually missing 

  something.&nbsp; This whole State-Trellis object system seems unnecessarily 

  complex and weighty for what I'm trying to do, and it's really making 

  debugging a big chore for me.</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>Lastly, is there any principled reason that the State object 

  uses names previousTag and previousPreviousTag instead of curTag and 

  previousTag?&nbsp; Even the startState has prevprev as START and prev as START 

  (no current).&nbsp; The semantics seem strange, and a function getNextState 

  combines a passed-in curTag with previousTag instead of combining a nextTag 

  with a curTag.&nbsp;&nbsp;&nbsp;I can't help but feeling I'm missing some 

  semantics in what's intended.</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>Please anybody respond to any part of this mail.&nbsp; Just 

  throwing some thoughts out there.&nbsp; And my apologies if I'm so far off 

  that I'm not even making sense.</FONT></DIV>

  <DIV><FONT size=2></FONT>&nbsp;</DIV>

  <DIV><FONT size=2>Thanks in advance,</FONT></DIV>

  <DIV><FONT size=2>Ben</FONT></DIV>

  <P>

  <HR>


  <P></P>_______________________________________________<BR>Ligncse256 mailing 

  list<BR>Ligncse256@ling.ucsd.edu<BR>http://pidgin.ucsd.edu/mailman/listinfo/ligncse256<BR></BLOCKQUOTE></BODY></HTML>