Dr. Linda Moreau provided a glimpse of work that occurs at the intersection of data science and natural language processing. Through a series of case studies in which algorithms originally developed for English were retargeted to other languages, she illustrated how important seemingly low-level linguistic processing is to the success of data science algorithms. Among the topics touched upon were cross-cultural name matching, Arabic proximity search, Chinese information retrieval (IR) for term highlighting, Korean word similarity, and emoji processing for sentiment analysis.


Dr. Linda Moreau (née Van Guilder) is a Principal Computational Linguist at the MITRE Corporation who has 25 years of industry experience developing and deploying language technologies and who periodically serves as an adjunct professor for Georgetown University's Department of Linguistics. Dr. Moreau earned her Ph.D. from Georgetown in 2007, with dissertation research focused on the applicability of cross-language speech perception to computational problems such as name matching for cross-cultural identity resolution. Throughout her career, Dr. Moreau has been involved in a number of specialty areas within the realm of Natural Language Processing (NLP), including information extraction, automatic summarization, Arabic handwriting recognition, machine translation and identity resolution. Her current work involves a blend of natural language systems engineering and NLP, with a goal of maximizing the success of data science analytic techniques as they are incorporated into workflows that process multilingual data.

  • Georgetown Hackathon participants
    Georgetown Co-Hosts Successful Hackathon

    Thanks to all the students who came out for our first hackathon, co-hosted by Georgetown Analytics, GWU Data Science, QED Group, and the Center for Global Data Visualization.

  • Data Viz Challenge
    Georgetown to Co-Host Data Visualization Challenge

    In collaboration with GWU and CGDV, Georgetown Analytics invites you to participate in a Data Visualization Challenge on April 12, 2019.

  • Georgetown Analytics Hosts Deloitte Core Consulting Series

    The MS Analytics program is partnering with the Deloitte Foundation to bring an exciting opportunity to Georgetown's campus on March 22 & 29, 2019.

  • Linguistic Diversity Around The World
    One Text / Two Languages

    This workshop covered how linguists gather, process and analyze code-switched data, exploring the NLP pipeline for processing multilingual texts and discussing various approaches to language identification.