Data Scientist | Computational Biologist
I'm a computational scientist working across discovery biology, translational research, and clinical development. I work with integrated clinical and molecular datasets, combining machine learning and statistical inference to study how biological signals are identified, evaluated, and interpreted in clinical contexts.
My work spans biomarker modeling, multi-omics analysis, and the development of reproducible analytical frameworks for large-scale molecular and clinical datasets, including genomics, transcriptomics, proteomics, and cfDNA-based assays.
I enjoy building scalable computational systems, including distributed workflows and orchestration frameworks, that make complex scientific analyses more reliable, interpretable, and collaborative.
My background in systems neuroscience and genomics originally drew me toward questions involving high-dimensional biological systems and quantitative modeling. Over time, that evolved into a broader interest in translational machine learning and computational approaches that support real-world scientific and clinical decision-making.
Programming Languages:
Python, R, MATLAB, SQL (PostgreSQL)
Machine Learning:
Supervised and unsupervised techniques (e.g., SVM, Random Forest, Boosted Trees, softmax, LDA, NMF), hyperparameter tuning, model evaluation
Data Science & Analytics:
Multi-omics data analysis, feature engineering, data preprocessing, statistical analysis, cross-validation, model deployment, data fusion
Bioinformatics:
Cell Ranger, BEDTools, Scanpy, fgsea
Cloud Computing & Tools:
AWS, GCP, Azure, Flyte, Conda, Poetry, uv
Frameworks & Libraries:
PyTorch, scikit-learn, Pandas, NumPy, SHAP, Matplotlib, Seaborn
Operating Systems:
Linux (Ubuntu, CentOS), macOS