23andMe Sunnyvale, CA
AI/ML Solutions Architect & Full Stack Developer Sep 2023 - Mar 2025
- Architected and deployed scalable machine learning solutions on AWS and GCP, leading the rollout of a genetic risk prediction platform that processed over 40 million inference requests annually with 99% uptime.
- Developed and productionized machine learning and deep learning models (TensorFlow, PyTorch, scikit-learn, Vertex AI, Hugging Face) for personalized disease risk scoring, NLP report generation, and ancestry clustering, increasing predictive accuracy by 18% and reducing model inference time by 43%.
- Integrated LLMs and generative AI (OpenAI, Vertex AI, Hugging Face) into self-service tools, voice AI chatbots, and customer support, decreasing support ticket volume by 27% and maintaining CSAT above 4.8/5.
- Designed and managed Kubernetes-based hybrid-cloud deployments (AWS, GCP, on-premises), meeting strict HIPAA/GDPR compliance and supporting international research collaborations.
- Built and maintained RESTful and GraphQL APIs (FastAPI, Node.js) supporting real-time health analytics for 14M+ users, with peak loads of 10,000+ concurrent sessions.
- Engineered big data and ETL pipelines (Python, Pandas, AWS Glue, MongoDB, PostgreSQL, BigQuery) for genomics analytics, enabling insights on datasets exceeding 8 petabytes and reducing data latency by 55%.
- Automated MLOps workflows using CI/CD (GitHub Actions, CircleCI, Docker, Terraform, Kubernetes), reducing deployment times to under 12 minutes and supporting 30+ releases/month.
- Drove cost optimization and performance tuning for cloud resources, lowering compute spend by 17% while implementing automated model retraining and model registry solutions.
- Collaborated with pre-sales, business analysis, and solution consulting teams to deliver ML-driven prototypes, resulting in three enterprise B2B contracts and expanding API revenue streams by $2.3M/year.
- Developed comprehensive test suites (unit, integration, and E2E with pytest, Jest, and Selenium), ensuring 98%+ test coverage and robust system reliability.
- Authored and maintained technical documentation (data flows, API specs, MLOps playbooks), supporting cross-team onboarding, internal training, and external partner integrations.
- Mentored and trained 6+ junior engineers, established best practices for model monitoring, code reviews, and Agile processes, and contributed to cross-functional product delivery teams.
Civis Analytics Chicago, IL
Machine Learning Engineer Aug 2021 - Feb 2023
- Specialized in Computer Vision (CV) and Large Language Models (LLMs), pioneering early adoption of Retrieval-Augmented Generation (RAG) pipelines, document OCR, and LLM-based document QA for civic data and survey analytics.
- Engineered and deployed scalable ML models (Python, scikit-learn, XGBoost, LightGBM) to support projects in voter turnout, public health forecasting, and segmentation for 25+ organizations.
- Piloted RAG systems using Hugging Face Transformers, FAISS, and LangChain, delivering context-enriched generative responses for knowledge management and survey data.
- Developed and integrated computer vision workflows (OpenCV, PyTorch, TensorFlow) for automated document OCR, form recognition, and visual data validation, improving data quality and automation rates.
- Built robust NLP pipelines (spaCy, Hugging Face, NLTK) to extract actionable insights from 10M+ records, powering open-text analysis and voice-of-customer initiatives.
- Designed and automated ETL pipelines (Airflow, AWS S3, Pandas, PostgreSQL), accelerating data onboarding by 60% and scaling analytics to multi-terabyte datasets.
- Developed production APIs and microservices (Flask, FastAPI, Docker, AWS), enabling real-time delivery of ML predictions with 99% uptime and 5,000+ hourly requests.
- Implemented MLOps and CI/CD best practices (GitHub Actions, Docker, Terraform), streamlining model retraining, versioning, and deployment across cloud and on-premises platforms.
- Audited and improved model fairness using Fairlearn and SHAP, delivering compliance and explainability reports and improving fairness metrics by 17%.
- Mentored junior engineers and analysts in modern ML, CV, LLM, and MLOps techniques, fostering a collaborative and high-performance engineering culture.
Broadridge Financial Solutions Lake Success, NY
Data Engineer Jun 2019 - Feb 2021
- Engineered and maintained cloud-based data lakes (AWS S3, Glue, EMR, Spark, Python, SQL) to process and store billions of daily securities transactions from global markets, reducing analytics processing time by 40%.
- Designed and automated ETL pipelines (Informatica, Python, shell scripting, Oracle, SQL Server) for extraction, transformation, and loading of financial data, improving data reliability and pipeline efficiency.
- Built real-time data integration and streaming solutions (Kafka, Spark Streaming, AWS Lambda) to power dashboards and reconciliation platforms for trade settlements and risk monitoring.
- Developed regulatory reporting automation workflows (Informatica, AWS, Python, Oracle) to generate and submit MiFID II, SEC, and Dodd-Frank reports, ensuring compliance and reducing manual intervention by 85%.
- Led client onboarding and large-scale data migration projects (Talend, Python, REST APIs, SQL), mapping and validating historical trading and reference data for new institutional clients.
- Optimized legacy batch jobs and data models (SQL, Control-M, Python) for risk and compliance analytics, shrinking the overnight processing window from 7 to 3 hours.
- Implemented robust data quality checks and lineage tracking (Informatica, Collibra, SQL), automating validation and improving regulatory and internal audit readiness.
- Collaborated with cross-functional teams (business analysts, QA, DevOps, product managers) to deliver data-driven solutions, enhance governance, and support key financial product launches.