Our team develops Next Generation Sequencing (NGS) solutions used by researchers and clinicians worldwide, providing sample-to-answer pipelines with high reliability, speed, and accuracy of results. We develop machine learning solutions across Illumina’s portfolio, from sequencing functions to analysis and interpretation algorithms. DRAGEN, our secondary analysis platform, has industry-leading performance and is used for clinical and research work. We also develop algorithms for on-sequencer pipelines, including super-resolution, basecalling, and denoising. Advanced AI applications drive transformational genetic insights that improve understanding of human biology, cancer, and rare disease.
We are seeking an ML Ops engineer to join our team. This role will develop, implement, and optimize data pipelines for ML systems across Illumina’s products, including DRAGEN and high-throughput sequencing systems like NovaSeq X, the highest throughput sequencer in the industry. You will collaborate with cross-functional teams (ML, implementation, bioinformatics, optics and imaging, test) to store and process petabytes of highly heterogeneous data (images, sequencing output, population data, truth sets, DNA, RNA, multi-omics, variant calls).
Responsibilities:
You will create and maintain code, documentation, testing and deployment frameworks, tools and infrastructure, working closely with engineers, researchers and domain experts on AI/ML models and pipelines
Work with experts across software engineering, hardware engineering, ML and data science, optics and imaging, precision motion, embedded systems, test
Develop environments for building, testing, and tracking production AI models and data across the data pipelines used in primary and secondary genomic analysis
Benchmark, track, and document model performance to enable continuous improvement of pipelines
Serve as a technical expert, helping internal customers and teams develop AI models within a consistent ML environment and automate training and data/model management
Participate in setting the long-term roadmap for technical solutions using ML across multiple pipelines
Stay up to date on best practices and drive adoption of standardized processes across the team and wider organization
Standardize the management of ML models and operationalize ML pipelines; support release, activation, and monitoring
All listed tasks and responsibilities are deemed as essential functions to this position; however, business conditions may require reasonable accommodations for additional tasks and responsibilities.
Qualifications:
Bachelor’s or Master’s degree in Computer Science or a related technical field, or equivalent experience
2+ years of relevant experience in machine learning and operations, ideally in a hands-on MLOps role (extraordinary applicants with less experience also considered)
Experience deploying APIs and packages
Experience with ML Ops platforms and tooling (MLflow, W&B, etc.; Kubernetes/Docker, Dask, RAPIDS, Ray, or similar)
Experience with ML frameworks (TensorFlow, Keras, PyTorch, XGBoost, scikit-learn)
Strong Python coding skills – experience with unit testing, code reviews, version control
Strong background and interest in ML Ops, DevOps, and data engineering
Self-starter, good problem-solving skills, ability to push forward project objectives both through individual effort and team collaboration
Experience with CI/CD platforms, ability to design a technical roadmap and influence/build alignment
All listed requirements are deemed as essential functions to this position; however, business conditions may require reasonable accommodations for additional tasks and responsibilities.
Additional Nice-to-Haves:
Bioinformatics, ML, software engineering principles, software test, applied math background and/or experience
NGS knowledge
Distributed computing for big data (HPC, Dask, Ray, etc.)
Experience with AWS, GCP, Azure
Visualization experience (Plotly, Matplotlib, etc.)
Familiarity with bioinformatics workflows, including primary and/or secondary analysis pipelines
Experience with revision control and related automation (Git, GitHub Actions)
Experience with ML acceleration technology (FPGA, GPU, etc.)
Strong Linux/Unix fundamentals
Strong documentation and presentation skills
Machine learning experience/knowledge
Degree and Job Experience Requirements:
Candidates may hold a degree in any of the following fields: Bioinformatics, Biology, Physics, Electrical Engineering, Computer Science, Software Engineering, Applied Math, or a related field
Bachelor’s, Master’s, or Ph.D.
Job experience: the role can be tailored to accommodate candidates ranging from recent graduates to experienced professionals
Singapore - Woodlands - NorthCoast