Data Scientist | Computational Biologist
I am an interdisciplinary data scientist with a PhD in Genetics, specializing in computational biology, multi-omics analysis, and machine learning. I have hands-on experience developing scalable data pipelines, integrating genomics, transcriptomics, and proteomics data, and deploying machine learning models for cancer detection, disease monitoring, and treatment response evaluation. I work with cross-functional teams to apply data science to scientific discovery and innovation.
Programming Languages:
Python, R, MATLAB, SQL (PostgreSQL)
Machine Learning:
Supervised and unsupervised techniques (e.g., SVM, Random Forest, Boosted Trees, softmax, LDA, NMF), hyperparameter tuning, model evaluation
Data Science & Analytics:
Multi-omics data analysis, feature engineering, data preprocessing, statistical analysis, cross-validation, model deployment, data fusion
Bioinformatics:
Cell Ranger, bedtools, Scanpy, fgsea
Cloud Computing & Tools:
AWS, GCP, Azure, Flyte, Conda, Poetry, uv
Frameworks & Libraries:
PyTorch, scikit-learn, Pandas, NumPy, SHAP, Matplotlib, Seaborn
Operating Systems:
Linux (Ubuntu, CentOS), macOS