Academic Seminar
Posted: 2013-11-15

Title: Topic Models with Discourse Constraints

Speaker: Lan Du (杜岚)

Time: Monday, November 18, 2013, 2:00 PM

Venue: Room 321, Science and Engineering Building (理工楼), Main Campus

Abstract: Natural language text usually exhibits internal structure: topically coherent segments consist of closely related semantic text units. For example, a section is a group of paragraphs that address similar topics. The topics addressed in a document do not appear in random order. Instead, they are discussed in an order that facilitates the reader's comprehension. Capturing this kind of discourse structure should lead to improved topic modeling, which can in turn benefit text analysis tasks such as segmentation, sentence ordering, and cross-document alignment. In this talk, I will discuss two novel modeling approaches that leverage discourse constraints on the topic structure. The first builds the constraints directly into the graphical structure by assuming that the topic structure is observed. The second, in contrast, learns the topic structure automatically with a novel split/merge algorithm. Both have shown improved topic modeling accuracy, measured by either perplexity or text segmentation performance.

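Bio: Dr. Lan Du is currently a Research Fellow at Macquarie University, Australia, and one of the leads of a Google natural language understanding project. Dr. Du received a bachelor's degree from Flinders University, Australia, in 2006, and a first class honours degree and a Ph.D. from the Australian National University (ANU) in 2007 and 2012, respectively. From 2008 to 2011, Dr. Du worked in the machine learning research group at NICTA in Canberra, Australia. Dr. Du's research interests include natural language processing and text analysis and mining, with more than 20 papers published in leading international journals and conferences, including Knowledge and Information Systems, The Computer Journal, Machine Learning Journal, and ICDM.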

