
Portfolio
Projects
Current work and past deliverables from our members
Active
Current Projects
Chatbot that leverages Retrieval-Augmented Generation (RAG) to provide up-to-date information about US legislation and policy using bills, hearings, and voting polls. Integrates recent data scraped from the US Congress website to answer policy-related questions more accurately than a standard LLM without retrieval.
This project focuses on unifying diffusion and optimal transport frameworks, including Flow Matching, Schrödinger Bridges, and DDPMs, to simplify diffusion model training and inference. It explores DiT architectures and optimized sampling methods to improve performance.
This project develops a personalized AI news digest system that uses user-interaction data to generate timed news summaries and recommend articles. It includes article ingestion, cleaning, embedding, user-interest modeling, and ranking pipelines using RSS feeds and vector similarity search.
AI-enabled tour guide web app that allows users to take pictures of Duke campus buildings and receive short summaries with up-to-date information. Uses CLIP for building recognition, LLMs for chatbot responses, and real-time location tracking for campus navigation.
AI-powered interface for competitive sports teams using computer vision techniques, including object detection and player tracking, to analyze sports video, map court locations, and segment game footage into player-specific possession clips.
Develops a multimodal neural controlled differential equation (NCDE) model to perform optical flow, interpolation, and extrapolation of medical image sequences to forecast disease progression in Alzheimer's patients using PET and MRI data.
Internal research initiative evaluating whether Vision-Language-Action (VLA) models can correctly interpret relational comparison instructions between objects using controlled two-object scenes in simulation environments like SimplerEnv or LIBERO.
Audio classification system that predicts the genre of a 30-second music clip using machine learning models trained on extracted audio features. Uses the FMA small dataset for preprocessing, feature extraction, label generation, and model evaluation.
Builds an AI chess engine from scratch that can analyze positions and play full chess games.
Financial prediction model that forecasts whether a stock will move up or down on the next trading day using company-related news headlines and historical price behavior. Combines news-based, price-based, and ensemble modeling approaches.
This project focuses on predicting changes in prediction market probabilities for events such as recessions, policy decisions, and geopolitical outcomes using market history and macroeconomic signals. It combines Polymarket data with financial indicators like VIX, gold, oil, and Bitcoin, and evaluates OLS, RidgeCV, and XGBoost models across multiple forecasting horizons.
Builds a multi-label text classification model (the D4 Classifier) that automatically assigns OpenAlex "Topic" tags to manuscripts based on abstracts. Trained on a high-precision "Silver Set" dataset to predict niche scientific topics.
Machine learning pipeline that automatically detects and removes speech dysfluencies, or stuttering. Uses Whisper-generated transcripts, a dysfluency detection model trained on labeled data, and time-domain audio removal to create fluent speech and transcripts.
Develops a Softmax/Dirichlet regression model that predicts the percentage share of contribution each author made to a manuscript based on their CRediT roles. Pre-trained on synthetic data, with an active training loop for continuous learning.
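As a rough illustration of the softmax idea behind the author-contribution project, the sketch below turns per-author role scores into percentage shares that sum to 100. The role weights, the `contribution_shares` function, and the example authors are hypothetical placeholders, not the project's trained model or data; a real model would learn the weights rather than hard-code them.

```python
import math

# Hypothetical per-role weights (assumption for illustration only);
# a trained regression model would learn these from data.
ROLE_WEIGHTS = {
    "Conceptualization": 1.0,
    "Investigation": 0.8,
    "Software": 0.6,
    "Supervision": 0.4,
}


def contribution_shares(author_roles, role_weights=ROLE_WEIGHTS):
    """Map each author's CRediT roles to a percentage share via softmax."""
    # Score each author by summing the weights of the roles they hold.
    scores = {
        author: sum(role_weights.get(role, 0.0) for role in roles)
        for author, roles in author_roles.items()
    }
    # Subtract the max score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {author: math.exp(score - top) for author, score in scores.items()}
    total = sum(exps.values())
    # Normalize so the shares are positive and sum to 100 percent.
    return {author: 100.0 * value / total for author, value in exps.items()}


if __name__ == "__main__":
    example = {
        "Author A": ["Conceptualization", "Investigation"],
        "Author B": ["Software"],
    }
    for author, share in contribution_shares(example).items():
        print(f"{author}: {share:.1f}%")
```

The softmax normalization guarantees that every author receives a positive share and that the shares always total 100%, which is the constraint a contribution-percentage predictor has to satisfy.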