Course Descriptions

Data Science Core Courses

  • Advanced Programming Topics. The Georgetown Analytics program offers an asynchronous, online course in programming preparation that covers R, Python, and command line use in the summer prior to matriculation. The course is equivalent to three credits, is designed for matriculating MS Analytics students, and is offered free of charge. It is required for incoming students who do not have a computer science degree and adequate preparation. Admitted students can have this requirement waived only after discussion with the Program Director (Todd Leen) or Program Coordinator (Heather Connor). This course will run during Georgetown Summer Session II (July 10 - August 11). Students must complete this course to matriculate in the fall unless granted a waiver by the program.

  • ANLY-501: Introduction to Data Analytics. This course introduces students to several core data science concepts. It teaches students how to synthesize disparate, possibly unstructured data to better understand and characterize the world, and in some cases, to draw meaningful inferences. Topics covered include: the history of data science, successes and failures in data analytics, the data analytics life cycle, data/web scraping and APIs, data wrangling, data characterization (correlations, identifying clusters and associations), data inference and basic machine learning, network analysis, data ethics, and visual analytics. Students will work on a semester-long data science project that starts with question formulation and data collection, and goes through all the stages of the life cycle, culminating in data storytelling. The course also maps data science case studies to topics presented throughout the semester. Prerequisites: Intermediate coding experience in Python3, and knowledge of introductory statistics, 3 credits.

  • ANLY-502: Massive Data Fundamentals. Today's data scientists are commonly faced with huge data sets (Big Data) that may arrive at very high rates and in a broad variety of formats. This core course addresses the resulting challenges. The course will introduce students to the advantages and limitations of distributed computing and to methods of assessing its impact. Techniques for parallel processing (MapReduce) and their implementation (Hadoop) will be covered, as well as techniques for accessing unstructured data and for handling streaming data. These techniques will be applied to real-world examples, using clusters of computational cores and cloud computing. Prerequisite: Working knowledge of Python and the Unix command line, some knowledge of data structures, 3 credits.

  • ANLY-503: Scientific and Analytical Visualization. Presenting quantitative information in visual form is an essential communication skill for data professionals. This course introduces representation methods and visualization techniques for complex data, drawing on insights from cognitive science and graphic design. Students will obtain an overview of the human visual system, learn to use models for data and for images, and acquire good design practices, such as those using the “grammar of graphics.” Students will use common statistical design tools such as graphic methods in Python3, interactive graphic methods such as Bokeh, Leaflet, and NetworkD3, the R package ggplot2, and Tableau. Prerequisites: ANLY-501, 3 credits.

  • ANLY-511: Probabilistic Modeling and Statistical Computing. Probabilistic models are essential for the understanding of data that are affected by uncertainty. This course introduces students to the fundamentals of probabilistic modeling and then covers computational techniques for the analysis of such data. After introducing basic concepts and approaches such as probability distributions, random variables, and conditioning, the course covers basic probability distributions that are frequently used in practice and some of their properties, such as Laws of Large Numbers. In the second half, students will learn about computational techniques for the use of probabilistic models. This includes methods for faithful simulation of random variables (Monte Carlo), the extraction of condensed models from observed data (maximum likelihood, Bayesian models), methods for models with hidden or partially observed variables (latent variables, expectation-maximization, hidden Markov models), and some general data science techniques that incorporate probabilistic models (graphical models, stochastic optimization). Prerequisites: Introductory statistics, some coding experience (e.g. R), 3 credits.

  • ANLY-512: Statistical Learning. Statistical learning is concerned with algorithms that use statistical techniques to find structure or patterns in given data (unsupervised learning) or use given instances of data to predict outcomes in new cases (supervised learning). A well-known method of this type is linear regression, and this will be covered early in the course. Statistical methods for making discrete predictions (classification) such as logistic regression will also be covered. Special emphasis will be placed on techniques for handling high-dimensional data (i.e. instances with many attributes), including variable selection and dimension reduction. The course will also cover ensemble methods such as bagging and boosting that are often used to improve the results of given classification methods. Unsupervised methods covered in this course include model-based and hierarchical clustering. Prerequisites: ANLY-511, 3 credits.

Elective Courses

Analytics Electives

  • ANLY-520: Effective Presentation for Technology & Science. Clearly communicating problems, ideas, data, analysis approaches, results, and recommendations for action is vital for career success in technology and science. Strong technical writing is clear and unambiguous, easy to read, and concise. This course improves students’ writing, presentation, and critique skills. They will learn to communicate material to technical and non-technical audiences. Students will learn to write strongly by improving text clarity, simplicity, and conciseness, and incorporating high-quality graphics (LaTeX will be used for paper preparation). Students will learn to craft oral presentations that are clear, easy to follow, informative, and compelling, and will develop delivery skills that improve comprehension, audibility, comfort, and audience engagement, 3 credits.

  • ANLY-531: Databases. This course covers the theoretical design principles of modern database systems, the data structures and algorithms used in their implementation, and the techniques and tools used in designing databases. It is a comprehensive introduction to relational database modeling, relational design principles based on functional dependencies and normal forms, query languages including SQL, and database optimization techniques (indexing, views, and integrity constraints), 3 credits.

  • ANLY-550: Structures and Algorithms for Analytics. This course covers algorithmic techniques for solving different types of data science problems. It will cover Big O notation, data structures (arrays, stacks, queues, lists, trees, heaps, graphs), sorting and searching (binary search trees, hash tables), and algorithmic paradigms for efficient problem solving (divide and conquer, recursion, greedy algorithms, dynamic programming, etc.). It will include both theory and practice. You will learn to design, analyze, and implement fundamental data structures and algorithms. This course will provide the algorithmic background essential for further study of computer science topics. Prerequisites: ANLY-501 and ANLY-511, 3 credits.

  • ANLY-561: Optimization. Optimization is concerned with the general task of finding a set of parameters such that a given target function is made as small as possible or such that the fit with a desired goal is as close as possible. Such parameters can be numbers, but also character strings, geometric shapes, or paths in a network. These problems are ubiquitous in data science. Topics of this course include common mathematical optimization paradigms, efficient algorithmic techniques, and important data science applications of optimization over Euclidean spaces. The primary paradigms covered are Linear Programming, Convex Programming, and Semidefinite Programming. Algorithmic techniques include Line Searches, Gradient Descent, Newton's method, the Simplex Method, and Interior Point Methods. Various formulations of the least-squares problem are used to motivate theory and techniques throughout the course, and the course concludes with a selection of applications of optimization in data science (which may include Clustering, Community Detection, Dimension Reduction, Expectation Maximization, Latent Semantic Indexing, Neural Networks, Search, Spectral Embeddings, Stochastic Gradient Descent, Support Vector Machines, or Visualization depending upon student interest), 3 credits.

  • ANLY-570: Decision and Game Theoretic Analysis. This course will cover various models of incentives and optimization methods under uncertainty, both when there is a single decision maker and when there are multiple decision makers: decision trees, influence diagrams, sequential games, repeated games, simultaneous games, and applications to industrial organization, commitment, asymmetric information, auctions, mechanism design, and bargaining. The course will then introduce you to data storage, processing, and analysis (statistical, structured query, and matrix) in SAS. Finally, the course will use SAS to apply decision and game theory to finance, including event studies, capital markets, asset pricing, insider trading, options, futures, and other derivatives. Prerequisites: Multivariate calculus and probability, 3 credits.

  • ANLY-580: Natural Language Processing for Data Analytics. This course will cover the major techniques for mining and analyzing textual data to extract interesting patterns, discover knowledge, and support decision-making. Students will learn the main concepts and algorithms in Natural Language Processing and their applications in data science. These include search and information retrieval, document clustering and classification, topic modeling, sentiment analysis, and deriving meaning from unstructured narratives. In addition to traditional techniques in machine learning such as regression, decision trees, and Naive Bayes algorithms, the course will also examine the latest approaches in Deep Learning. Students will be given the opportunity to develop hands-on experience in building foundational tools and machine learning algorithms that can be applied to real analytics problems. The data obtained from textual content can be used to augment numerical data for the purposes of building predictive models, identifying emerging issues, detecting opinion, and determining important relationships. Prerequisites: Working knowledge of Python, ANLY-511 and ANLY-512 or their equivalent, 3 credits.

  • ANLY-590: Neural Networks and Deep Learning. This course will explore the fundamentals of artificial neural networks (ANNs) and deep learning. The following topics will be covered: feed-forward ANNs, activation functions, output transfer functions for regression and classification, cost functions and related likelihood functions, backpropagation and optimization (including stochastic gradient descent and conjugate gradient), auto-encoders for manifold learning and dimensionality reduction, convolutional neural networks, and recurrent neural networks. Overfitting and regularization will be discussed from both theoretical and practical viewpoints. Concepts and techniques will be applied to several domains including image processing, time series analysis, natural language processing, and more. Students will gain mastery of popular deep learning frameworks in the Python ecosystem including TensorFlow and Keras. Prerequisites: ANLY-511 and ANLY-512, fluency with Python, 3 credits.

  • ANLY-905: Internship. The ANLY Internship course permits the student to gain practical work experience in data analysis. Internships must be directly related to the student’s academic program goals and further both their practical and academic skills. Students must obtain the approval of the ANLY Program Director to register. Approved internships must be aligned with the Analytics program and provide a significant learning experience for the student. At the end of the internship, the student must submit a deliverable to the course instructor, 0.25 credits.

Computer Science Electives

The following are computer science courses that could be appropriate for use towards elective requirements. Please be aware that courses that are part of the core computer science curriculum may have seating priority for CS students, or have prerequisite restrictions. You should speak directly with the course instructor to see if seating is available and if you satisfy prerequisites prior to semester enrollment.

Mathematics Electives

The following is a selection of mathematics courses that could be appropriate as electives. Some of these courses are heavily subscribed by students in the Math/Stats graduate program. You should speak directly with the course instructor to see if seating is available and if you satisfy prerequisites prior to semester enrollment.

Other Department Electives

Please be aware that courses that are part of other departments' curricula may have seating priority for their students, or have prerequisite restrictions. You should speak directly with the course instructor to see if you are eligible to enroll.

Consortium Courses

Georgetown University graduate students may enroll in courses at other universities in the Washington, DC area through the Consortium of Universities of the Washington Metropolitan Area, provided the courses are not available at Georgetown University. You must obtain permission from this program, the Georgetown Graduate Dean, and the visited institution, and you cannot register for a Consortium course during Early Registration. Detailed rules are available on the Graduate School webpage.

The total of all transfer and consortium courses may not exceed 25% of the curriculum that is counted towards graduation. In addition, transfer and consortium courses do not count toward the Georgetown grade point average.

If you took a class at another area institution directly (not through the Consortium), you can ask for transfer of credit, subject to the 25% limit on transfer credit.