Student Speaker Abstracts
Farshid Abadizaman | Ph.D. candidate in Applied Mathematics - Georgetown University
Talk: Efficient Bayesian Variable Selection under Predictor Dependence Regimes
We propose an efficient Bayesian variable selection framework that incorporates the dependence structure among predictors while remaining computationally scalable. The framework is built on a discrete spike-and-slab prior formulation. Compared with binary Markov random field priors, which are commonly used in this setting, our approach is more robust to hyperparameter specification. Computational scalability is achieved through an algorithmic design that exploits the algebraic structure of the likelihood–prior formulation to enable fast and numerically stable updates of posterior quantities as the model evolves, thereby eliminating the need for repeated matrix inversions while preserving accurate posterior inference. We demonstrate the performance of the proposed method and its computational advantages through simulation studies involving thousands of samples and hundreds of thousands of covariates, and we further illustrate its practical utility in applications to genomic studies.
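As an illustration of how repeated matrix inversions can be avoided when a model grows, the sketch below updates the inverse of a ridge-regularised Gram matrix one covariate at a time using the standard block-inverse (Schur complement) identity. This is not the speaker's algorithm, which the abstract does not specify; the function `grow_inverse`, the `ridge` parameter, and the simulated data are all hypothetical.

```python
import random

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def grow_inverse(A_inv, cols, x_new, ridge=1.0):
    """Inverse of (X'X + ridge*I) after appending the column x_new.

    A_inv is the current inverse for the columns in `cols` (each a list of
    n values). The block-inverse identity needs only matrix-vector products,
    so no fresh inversion is performed.
    """
    k = len(cols)
    b = [dot(c, x_new) for c in cols]            # cross-products with current columns
    u = [dot(row, b) for row in A_inv]           # u = A_inv @ b
    s = dot(x_new, x_new) + ridge - dot(b, u)    # Schur complement (a positive scalar)
    out = [[A_inv[i][j] + u[i] * u[j] / s for j in range(k)] + [-u[i] / s]
           for i in range(k)]
    out.append([-u[j] / s for j in range(k)] + [1.0 / s])
    return out

# Hypothetical data: 30 samples, 5 covariates entering the model one by one.
random.seed(0)
n, p, ridge = 30, 5, 1.0
columns = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]

A_inv, active = [], []
for x in columns:
    A_inv = grow_inverse(A_inv, active, x, ridge)
    active.append(x)

# Check: A_inv times the full regularised Gram matrix should be the identity.
gram = [[dot(ci, cj) + (ridge if i == j else 0.0) for j, cj in enumerate(active)]
        for i, ci in enumerate(active)]
product = [[dot(row, [gram[r][j] for r in range(p)]) for j in range(p)]
           for row in A_inv]
max_err = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(p) for j in range(p))
```

Each update costs on the order of k² + nk operations for a model with k active covariates, compared with roughly k³ for inverting from scratch. That difference is the kind of saving that matters when a sampler adds and removes covariates many times.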
Zixiang Xu | Ph.D. candidate in Statistics - George Mason University
Talk: Data Shift Problems Viewed from the Perspective of Selection Bias
Data shift problems are an active research topic because they arise frequently when modern models are trained on data from multiple domains. This talk will introduce data shift problem settings and their connection to selection bias problems, with a focus on the “covariate shift” case, illustrated with toy-example simulation results. If time permits, we will also discuss the “posterior shift” case.
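One way to see the link between covariate shift and selection bias is a toy simulation in which training points are kept with a probability that depends on the covariate. The sketch below is not taken from the talk. It assumes a logistic selection probability, a quadratic outcome, and a deliberately misspecified straight-line fit, and it reweights each kept point by the inverse of its selection probability. All names and settings are hypothetical.

```python
import math
import random

def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(1)
xs, ys, inv_prob = [], [], []
for _ in range(50000):
    x = random.gauss(0, 1)                    # target population: x ~ N(0, 1)
    keep_prob = 1.0 / (1.0 + math.exp(-x))    # assumed selection mechanism
    if random.random() < keep_prob:           # larger x is more likely to be kept
        xs.append(x)
        ys.append(x * x + random.gauss(0, 0.1))
        inv_prob.append(1.0 / keep_prob)

# In the target population the best straight line through y = x^2 has slope 0.
_, unweighted_slope = weighted_line_fit(xs, ys, [1.0] * len(xs))
_, weighted_slope = weighted_line_fit(xs, ys, inv_prob)
```

Because the selected sample over-represents large values of x, the unweighted slope is pulled away from the population value of zero, while the inverse-probability-weighted slope is approximately centred on it. In practice the selection probabilities are unknown and must be estimated, which this sketch sidesteps.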
Zeynep “Zedo” Yilmaz | Undergraduate in Data Science - American University
Talk: Predicting and Preventing Sports Injuries: A Data-Driven Classification Approach
Sports injuries remain a major concern for athletes, coaches, and medical professionals, often impacting both performance and long-term health outcomes. In this study, we analyze athlete-level data to model the probability of injury as a function of training characteristics, such as exercise load and warm-up practices. We employ classification methods to predict injury risk and evaluate the out-of-sample predictive power. In addition, we investigate the relative importance of covariates to identify key factors associated with increased or decreased injury risk. Our results highlight the role of training intensity and preparation routines. These findings demonstrate how data-driven approaches can inform personalized training strategies and contribute to injury prevention.
Jilei Lin | Ph.D. candidate in Statistics - George Washington University
Talk: Censored Bent-line Quantile Regression with Application in Experimental Autoimmune Myasthenia Gravis Studies
Experimental autoimmune myasthenia gravis (EAMG) is an established animal model for studying the progression of MG, a chronic autoimmune neuromuscular disorder, and for developing effective treatments. In EAMG studies, weight trajectories exhibit biologically distinct phases: an initial pre-chronic period followed by chronic sub-phases with differing rates of change. These structured transitions motivate modeling the relationship between weight and treatment time using a bent-line regression framework. However, the analysis is complicated by ethical and experimental protocols requiring early euthanasia of severely affected mice, which introduces monotone missingness in the weight data. Standard approaches that ignore this mechanism yield biased estimates of both slopes and change points. We develop a censored bent-line quantile regression framework with a simple estimation procedure. By leveraging the appealing identifiability of quantiles under censoring, the method consistently identifies pre-chronic and chronic sub-phases without making parametric distributional assumptions. We establish the consistency and asymptotic normality of the proposed estimator and develop a bias-corrected approach for constructing confidence intervals. Simulation studies and application to the EAMG data demonstrate that the proposed method substantially reduces bias and improves inference in evaluating the efficacy of an antigen-specific immunotherapeutic vaccine.
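The appeal of quantiles under censoring can be illustrated with a small simulation. The sketch below is not the speaker's estimator. It assumes a single change point, treats early euthanasia as left-censoring of weight at a fixed floor (a simplification the abstract does not state), and uses made-up parameter values. It compares how the mean and the median of the recorded weights react to that censoring at one time point.

```python
import random
import statistics

def bent_line(t, b0, b1, b2, delta):
    """Piecewise-linear trend: slope b1 before delta, slope b1 + b2 after it."""
    return b0 + b1 * t + b2 * max(t - delta, 0.0)

random.seed(2)
floor = 17.0   # hypothetical censoring floor for weight (grams)
t = 20.0       # a time point after the assumed change point at day 10
center = bent_line(t, b0=20.0, b1=0.1, b2=-0.3, delta=10.0)

# Latent weights at time t, and what would be recorded under the assumed censoring.
latent = [center + random.gauss(0, 2) for _ in range(1001)]
recorded = [max(w, floor) for w in latent]

mean_shift = statistics.mean(recorded) - statistics.mean(latent)
median_latent = statistics.median(latent)
median_recorded = statistics.median(recorded)
```

Censoring moves the mean of the recorded weights upward, but the median is unchanged as long as it lies above the floor, because taking the maximum with a constant preserves the ordering of the data. This equivariance is one reason quantile-based models can remain identifiable under censoring without parametric distributional assumptions.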