Dr. Linda Moreau provided a glimpse of work that occurs at the intersection of data science and natural language processing. Through a series of case studies in which algorithms originally developed for English were retargeted to other languages, she illustrated how important seemingly low-level linguistic processing is to the success of data science algorithms. Among the topics touched upon were cross-cultural name matching, Arabic proximity search, Chinese IR for term highlighting, Korean word similarity, and emoji processing for sentiment analysis.


Dr. Linda Moreau (née Van Guilder) is a Principal Computational Linguist at the MITRE Corporation with 25 years of industry experience developing and deploying language technologies. She periodically serves as an adjunct professor in Georgetown University's Department of Linguistics. Dr. Moreau earned her Ph.D. from Georgetown in 2007, with dissertation research focused on the applicability of cross-language speech perception to computational problems such as name matching for cross-cultural identity resolution. Throughout her career, Dr. Moreau has been involved in a number of specialty areas within the realm of Natural Language Processing (NLP), including information extraction, automatic summarization, Arabic handwriting recognition, machine translation and identity resolution. Her current work involves a blend of natural language systems engineering and NLP, with a goal of maximizing the success of data science analytic techniques as they are incorporated into workflows that process multilingual data.

  • Linguistic Diversity Around The World
    One Text / Two Languages

    This workshop covered how linguists gather, process and analyze code-switched data, exploring the NLP pipeline for processing multilingual texts and discussing various approaches to language identification.

  • Analytics student Ratnadeep Mitra stands in front of his poster next to program director Dr. Ami Gates.
    Analytics Student Showcase

    MS Analytics students showcased what they learned this semester by applying contemporary deep learning techniques to a variety of exciting topics and problems.

  • Neural attention model
    Commonsense Reasoning without Commonsense Knowledge

    Reading comprehension tasks measure the ability to answer questions that require inference from a free text story. This talk explores two machine learning approaches.

  • MS Analytics Students Selected as Finalists in Adobe Analytics Challenge
    MS Analytics Students Win Third Place in Adobe Analytics Challenge

    Shengye Hang, Ju Huang, and Tianyi Yang represented Georgetown as Team "Don't Stop Bayesian" at the 2018 Adobe Analytics Challenge.