MUFAKIR ANSARI
Ph.D. Candidate · Data Scientist · AI Researcher · Machine Learning Engineer

Career Brand Statement

Ph.D. candidate, data scientist, and AI researcher with 8+ years of combined academic and industry experience designing, building, and delivering machine learning systems across biomedical AI, scientific computing, high-performance computing, forecasting, and applied analytics. Demonstrated ability to translate complex research into reproducible, production-grade pipelines with measurable outcomes, including a clinical cancer detection model achieving AUC-ROC 0.950 over 277,000+ pathology image patches, an HPC-automated genomics pipeline processing 356 RNA-seq runs, and end-to-end ML delivery for seven SaaS clients with a 35% reduction in pipeline latency. Equally effective in data scientist, machine learning engineer, applied scientist, data analyst, and AI research roles that require rigorous analysis, strong quantitative reasoning, and production discipline.


Professional Value Proposition

I build ML and data systems that are measurable, documented, and repeatable — not one-off experiments. My background spans the full pipeline from raw data ingestion and feature engineering through model training, evaluation, deployment, and monitoring. I bring strong statistical foundations (A/B testing, causal inference, Bayesian methods), hands-on experience with large-scale compute environments (SLURM, GPU clusters, AWS, Azure Databricks), and a research record that bridges academic rigor with engineering pragmatism. I am most effective on teams that value clear metrics, honest evaluation, and systems that actually work in production.


Target Job Titles
Data Scientist · Machine Learning Engineer · Applied Scientist · AI Research Scientist · Data Analyst · Analytics Engineer · Research Engineer · ML Research Scientist · NLP Engineer · Computer Vision Engineer · Quantitative Analyst · Computational Scientist

Core Competencies
Data Science · Machine Learning · Deep Learning · Statistical Modeling · Data Analysis · A/B Testing · Causal Inference · Forecasting · Time Series Analysis · Natural Language Processing · Large Language Models · Computer Vision · Retrieval-Augmented Generation · Feature Engineering · Model Evaluation · Experiment Design · Scientific Computing · Biomedical AI · High-Performance Computing · Distributed Systems · MLOps · Reproducible Research · Research to Production

Technical Skills

Languages: Python, SQL, PySpark, R, Rust, CUDA, C++, Java

ML / AI Frameworks: PyTorch, TensorFlow, Scikit-learn, XGBoost, LightGBM, Hugging Face Transformers, LangChain, SimCLR, FAISS, ChromaDB

Data & Cloud: AWS, Azure Databricks, GCP, Apache Spark, Airflow, MLflow, Docker, Git, PostgreSQL, MySQL, MongoDB, Neo4j

HPC & Systems: SLURM, MPI, GPU Clusters, HISAT2, featureCounts, Kallisto, bcftools


Key Accomplishments
  • AUC-ROC 0.950: Built clinical-grade invasive ductal carcinoma detection pipeline processing 277,000+ pathology image patches using SimCLR domain-specific pretraining. False negative rate as low as 0.34%. Explainability via Grad-CAM, UMAP, and t-SNE.
  • 356 RNA-seq SRA runs: Designed and automated end-to-end Ebola outbreak genomics pipeline on Ohio Supercomputer Center SLURM HPC cluster (HISAT2, Kallisto, bcftools). Delivered 14 publication-grade outputs with checkpoint-based resume.
  • 35% pipeline latency reduction: Led ML and ETL delivery for 7 SaaS clients as Technical Lead at Orcinus IT Solutions. Improved deployment reliability by 30% and reduced pipeline failures across production environments.
  • GPA 3.91/4.00: M.S. Computer Science (AI Track), University of Toledo. Combined with 28 Google Scholar citations, h-index 3, i10-index 1, and 6 publications under review or published in peer-reviewed international venues.
  • 37,000+ transportation records: Built ensemble ML pipelines and deployed reproducible AWS and Azure Databricks workflows for large-scale experimentation at the Transportation Systems Research Lab.

Education
Ph.D., Computer Science & Engineering · Wright State University, Dayton, OH · 01/2026 – Present
M.S., Computer Science & Engineering (AI Track) · University of Toledo, Toledo, OH · 08/2023 – 08/2025 · GPA 3.91/4.00
B.Tech., Electronics & Communication Engineering · NIT Srinagar, India · 07/2009 – 07/2013

Availability & Work Authorization

Status: Actively seeking full-time opportunities · Available immediately
Work Authorization: Authorized to work in the United States
Location: Ohio / Michigan, USA · Open to relocation nationwide
Work Model: Open to remote, hybrid, or on-site positions