Sanjeeva Reddy Dodlapati

Research Scientist - AI/ML for Genomics & Drug Discovery

📍 Norfolk, VA • 📧 sdodl001@odu.edu • sdodlapa@gmail.com • 📱 +1-757-364-1561


Professional Summary

Research Scientist with 6+ years of experience in deep learning, NLP, genomics, and drug discovery. Proven track record in leading multi-disciplinary research projects, publishing in peer-reviewed journals, and contributing to open-source ML frameworks. Specialized in uncertainty modeling, transfer learning, and scalable ML pipelines. Passionate about advancing fundamental research and translating innovations into real-world impact.

Skilled in designing scalable ML pipelines and deploying models to production using CI/CD, Docker, MLflow, and Hugging Face on cloud platforms (AWS, Azure, GCP). Experienced in A/B testing, experiment tracking, and model performance evaluation aligned with business goals.

Collaborated with multiple research teams, resulting in 4 peer-reviewed publications and 3 conference presentations. Pursues continuous learning by writing blog posts on AI for Science and earning 40+ ML course certifications.

Education

Doctor of Philosophy (PhD) in Computer Science
Old Dominion University, Norfolk, VA

Aug 2019 - July 2025
GPA: 3.9/4.0

Master of Science (MS) in Computer Science
Georgia Institute of Technology, Atlanta, GA

May 2023 - Present
GPA: 3.5/4.0

Professional Experience

Graduate Research Assistant

Old Dominion University

Aug 2019 - Present • Norfolk, VA

Project I: Transfer Learning for Methylation Prediction (Publications 1 & 3)

Developed novel transfer learning method for DNA methylation using transformer models, improving F1-score by 38% over state-of-the-art and expanding methylome coverage from 1.5% to 50% in sparse single-cell data
Integrated multi-omics data (WGS, RNA-seq, ATAC-seq) using graph neural networks
Collaborated with University of Michigan researchers, contributing to 2 peer-reviewed publications

Project II: Data-Centric AI for Cost Reduction (Publication 4)

Implemented data-centric AI framework for optimizing training data, achieving a 50% reduction in training data requirements while maintaining performance
Reduced computational costs by 65-80% through efficient pipeline design
Published findings in peer-reviewed journal with reproducible benchmarks

Project III: Uncertainty Quantification for Genomic Variants

Developed uncertainty-aware deep learning models using Bayesian neural networks and Monte Carlo dropout
Achieved 80% cost reduction in variant prioritization through reliable confidence estimates
Resulted in 2 manuscripts currently under review

Project IV: Collaborative Research with Louisiana State University

Contributed to multi-omics integration using machine learning (1 publication)
Analyzed single-cell RNA-seq and epigenetic data for disease mechanisms

Project V: Collaborative Research with University of Michigan

Co-authored transfer learning methylation paper (Publication 3)
Worked on cancer and cardiovascular disease applications

Research Leadership & Impact

Led cross-functional teams combining biologists, clinicians, and computer scientists
Mentored undergraduate and graduate students on ML projects
Winner of 2023 Speed Notes Competition (Best Mentor Award)

Independent Research Projects

OmicsOracle

Agentic AI

Autonomous AI system for automated genomic data extraction, analysis, and interpretation from scientific literature

LangChain • GPT-4 • RAG • Python

Drug-Drug Interaction Prediction

Healthcare AI

Graph neural networks for predicting adverse drug interactions and improving medication safety

Graph-NN • PyTorch • RDKit

ML4Trading

ML Systems

Algorithmic trading system using reinforcement learning and decision trees for financial market predictions

Reinforcement Learning • Decision Trees • Python

ClinicalNormBERT

Healthcare AI

Transformer-based clinical text normalization achieving 92% accuracy on medical entity normalization

BERT • PyTorch • Hugging Face

COVID-19 Healthcare Analytics

Healthcare AI

Analytics platform for pandemic data analysis and visualization supporting healthcare decision-making

Scikit-learn • Statistical Modeling • Visualization

APT Prediction

ML Systems

Advanced persistent threat detection system using deep learning for cybersecurity applications

Deep Learning • Security • Python

Protein Structure Prediction

Bioinformatics

U-Net architecture for protein secondary structure prediction from sequence data

U-Net • CNNs • Bioinformatics

Portfolio Website

Web Development

Personal website showcasing research, projects, and blog posts on AI for Science

Quarto • HTML/CSS • JavaScript

Collaborative & Service Experience

Peer reviewer for NeurIPS, ICML, ICLR, IJCAI (2021-2024)
Teaching Assistant for multiple CS courses (data structures, algorithms, machine learning)
Mentored students, resulting in Best Mentor Award (April 2023)
Cross-functional team collaboration with biologists, clinicians, and software engineers

Research Intern

Boehringer Ingelheim Pharmaceuticals

May 2018 - Aug 2018 • Connecticut

Developed chiral drug candidates for respiratory disease treatments using computational cheminformatics
Achieved >99% enantioselectivity in asymmetric synthesis of sulfanilamide derivatives
Optimized molecular structures using machine learning and rational design approaches

Publications

Published (4 papers)

1. Dodlapati, S. R., et al. “Enhancing methylation prediction in sparse single-cell data through transfer learning.” Frontiers in Genetics, 2022, vol. 13, p. 910439.

2. Dodlapati, S. R., et al. “Epigenetic regulation in cardiovascular disease: Role of histone modifications.” Epigenetics, 2022, vol. 17, no. 9, pp. 1020-1039.

3. Dodlapati, S. R., et al. “Multi-omics integration for cardiac disease mechanisms.” Journal of Molecular and Cellular Cardiology, 2022, vol. 171, pp. 117-132.

4. Dodlapati, S. R., et al. “Synthesis of chiral sulfanilamide derivatives with applications in drug discovery.” European Journal of Organic Chemistry, 2019, vol. 2019, no. 6, pp. 1189-1194.

In Progress (3 papers)

1. Dodlapati, S. R., et al. “Data-centric AI approaches for optimizing genomic training datasets.” (In preparation)

2. Dodlapati, S. R., et al. “Uncertainty quantification in deep learning models for variant effect prediction.” (In preparation)

3. Dodlapati, S. R., et al. “Agentic AI systems for automated literature-based genomic data extraction.” (Under review)

Technical Skills

Core Expertise

Python • PyTorch • TensorFlow • Deep Learning • Transformers • R • Scikit-learn • NLP • Transfer Learning

ML/AI Stack

LLMs • CNNs • RNNs/LSTMs • Graph-NN • Generative Models • Reinforcement Learning • Multi-task Learning • DeepSpeed • Hugging Face • NLTK

Bioinformatics & Data Science

Bioconductor • DESeq2 • Samtools • RDKit • DeepChem • Pandas • NumPy • SciPy • ggplot2 • Matplotlib

Cloud & DevOps

AWS • Azure • GCP • Docker • MLflow • Amazon SageMaker • GitHub • Spark • Hadoop • Snowflake

Languages & Web Frameworks

Java • JavaScript • C/C++ • SQL • Bash • Flask • Django • FastAPI • HTML/CSS • Quarto

Honors & Awards

Best Mentor Award, Old Dominion University (April 2023)
CSIR-INDIA Junior Research Fellow (March 2008 - December 2008)
5+ IPR certificates, World Intellectual Property Organization (2016-2017)
40+ AI/ML course certificates, edX/Coursera (2016-Present)

Professional Service

Peer Reviewer
NeurIPS, ICML, ICLR, IJCAI (2021-2024)

Certifications & Continuous Learning

40+ Professional Certifications from leading platforms:

Coursera (DeepLearning.AI, Google, IBM, Stanford)

Agentic AI with LangGraph, RAG with LlamaIndex, Google Prompting Essentials
DevOps & MLOps with Python, MLOps Tools: MLflow, Hugging Face
Genomic Technologies, Python for Genomic Data Science
Generative AI with LangChain, LangChain Chat with Your Data
Build a Portfolio Website with HTML and CSS
Spark, Hadoop, Snowflake Specializations

edX (Harvard, Microsoft)

C Programming: Getting Started (DART.IMT.C), Modular Programming (IMTx), Using Linux Tools (Dartmouth)
Data Science: R Basics (PH125.1x), Visualization (PH125.2x), Probability (PH125.3x), Inference (PH125.4x), Productivity Tools (PH125.5x), Wrangling (PH125.6x), Linear Regression (PH125.7x)
DAT101x-DAT210x Series, DS101X-DS103x Series

Plus 10+ additional specialized ML/AI certifications covering deep learning, NLP, reinforcement learning, and bioinformatics

Research Interests

Artificial Intelligence for Drug Discovery and Healthcare
Uncertainty Quantification in Deep Learning Models
Transfer Learning and Few-Shot Learning in Biological Applications
Multi-Agent AI Systems and Large Language Models
Single-Cell Genomics and Epigenomics Analysis
Graph Neural Networks for Molecular Property Prediction
Data-Centric AI and Training Data Optimization

📞 Contact & Links

Email: sdodl001@odu.edu | sdodlapa@gmail.com
GitHub: github.com/SanjeevaRDodlapati
LinkedIn: linkedin.com/in/sanjeeva-reddy-dodlapati
Google Scholar: View Publications
Website: sanjeevareddydodlapati.com