[Lign274] Final project topic and groups
Randy West
rdwest at ucsd.edu
Tue Feb 23 22:21:37 PST 2010
Hi Emily et al,
That sounds like an interesting topic as well. I had some significant
exposure to Bayesian surprise with respect to modeling saccades in a
cognitive modeling seminar course that I took last quarter, so I'd like
to see how it might be effectively integrated into a hierarchical model
to do reading time analysis. The Demberg and Keller model makes sense,
though I haven't had a chance yet to look at their results and analysis.
To everyone who's planning on doing a final project, I'd like to suggest
that we meet briefly after class tomorrow to sort out who is working
with whom. Hopefully that will give us all enough time to submit
proposals to Roger.
Best,
Randy
Emily Morgan wrote:
> Hi Randy et al,
> As it turns out, we're mostly all behind on this... Here's an idea I
> had for a project:
> I'm interested in eye movements during reading. There's a dataset
> called the Dundee corpus with eye-tracking data for people reading
> large amounts of text. There's been some work on predicting reading
> times from this dataset using hierarchical models, e.g.
> doi:10.1016/j.cognition.2008.07.008
> <http://dx.doi.org/10.1016/j.cognition.2008.07.008>. One of the big
> questions is the role of frequency versus contextual probability as
> predictors of reading time. (Of course both will on average lead to
> faster reading times, but what about a low-frequency word that's
> highly predictable in context, or a high-frequency word that's
> unpredicted given the context?) The linked paper above begins to
> answer this question. Another angle to approach this question from
> (with credit to Nathaniel Smith for suggesting this to me) is whether
> frequency and predictability would play different roles for different
> ranges of fixation durations: he suggests that frequency is playing a
> larger role for very short fixations, while contextual probability is
> playing a larger role in longer fixations. Hierarchical models seem
> like a great tool to address this question. Anyone interested in
> working on it with me? (I'm a first year student in the linguistics
> PhD program, by the way.)
>
> ~Emily
>
>
> 2010/2/22 Randy West <rdwest at ucsd.edu>
>
> Hi All,
>
> I might be a little bit late to the punch here, but I'm still
> looking for group member(s) for our final project. Briefly, I'm a
> first year Master's student in computer science with a focus on
> artificial intelligence. I have a little bit of formal training in
> English syntax, but other than that I have very little linguistics
> background. I've included an outline of my idea for a final
> project below, but if everyone has already formed groups then
> please let me know so that I can help out with one of your projects.
>
> Here's my idea:
>
> I'd like to do an analysis of the efficacy of various methods that
> we've covered in class for use in search engine technology. The
> dataset could be any collection of documents, but I was thinking
> of building a web-crawling script to run over, say, Google News
> for a few days and build up a database that way. The idea would be
> to produce for each model (or mixture of models) an ordering of
> documents in the dataset based on an ordered vector of search
> terms, i.e. a vector of documents ordered by p(document | search
> terms). The simplest such model would be the product of unigram
> likelihoods for each search term in the document, while something
> more complex might be using LDA to determine topic distributions
> over words and documents and leveraging those distributions for
> search.
>
> Please let me know what you all think, and again, if everyone has
> already settled into groups then please let me know so that I can
> help out.
>
> Best,
> Randy
>
> _______________________________________________
> Lign274 mailing list
> Lign274 at ling.ucsd.edu
> http://pidgin.ucsd.edu/mailman/listinfo/lign274
>
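For reference, here is a minimal sketch of the simplest model in the quoted proposal: ranking documents by the product of unigram likelihoods of the search terms. It assumes a uniform prior over documents, so ordering by p(search terms | document) gives the same ranking as p(document | search terms). The add-alpha smoothing, the whitespace tokenizer, and the names `tokenize` and `rank_documents` are illustrative assumptions, not something specified in the thread.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; a real system would do more.
    return text.lower().split()


def rank_documents(docs, query, alpha=1.0):
    """Return document ids sorted by descending log p(query | document).

    docs  -- dict mapping document id to raw text
    query -- string of search terms
    alpha -- add-alpha smoothing constant (an assumption, so that terms
             missing from a document do not give zero probability)
    """
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    query_terms = tokenize(query)

    # Shared vocabulary over all documents plus the query terms.
    vocab = set(query_terms)
    for tokens in tokenized.values():
        vocab.update(tokens)

    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        denom = len(tokens) + alpha * len(vocab)
        # Sum of logs equals the log of the product of unigram likelihoods.
        scores[doc_id] = sum(
            math.log((counts[term] + alpha) / denom) for term in query_terms
        )
    return sorted(scores, key=scores.get, reverse=True)
```

Calling `rank_documents` with a dict of document texts and a query string returns the ids from best to worst match. A topic-model variant (e.g. LDA) would replace the per-document unigram estimate with one derived from topic distributions.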