At Roche you can show up as yourself, embraced for the unique qualities you bring. Our culture encourages personal expression, open dialogue, and genuine connections, where you are valued, accepted and respected for who you are, allowing you to thrive both personally and professionally. This is how we aim to prevent, stop and cure diseases and ensure everyone has access to healthcare today and for generations to come. Join Roche, where every voice matters.
Machine Learning Engineer - Scientific Solutions
Generative AI and Machine Learning are foundational capabilities and key enablers for the overall digital transformation at Roche. We are seeking a talented and motivated Machine Learning Engineer Expert to join us in Roche Informatics on a journey to drive transformation with digital, data and AI. The candidate will support the design and development of scientific solutions and needs a solid background in machine learning, data processing, and software engineering, with experience in handling high-performance systems and optimizing AI capabilities for Roche business needs.
Responsibilities
You will work on various aspects of GenAI, machine learning engineering, and AI architecture, including but not limited to:
Developing and implementing robust Machine Learning and Artificial Intelligence solution architectures, translating prototype designs into detailed engineering specifications and ensuring execution.
Developing, customizing, and deploying MLOps services like Vertex AI, SageMaker, and Kubeflow.
Prototyping and developing cloud-native architecture solutions for application needs, particularly with AWS.
Providing infrastructure-as-code utilizing Terraform and AWS CloudFormation.
Performing automation, testing, performance tuning, and tools development.
Provisioning and maintaining cloud infrastructure that will support training machine learning models.
Operating models for accuracy, efficiency, and scalability, ensuring they meet performance requirements.
Maintaining clear and comprehensive documentation of models, experiments, and processes.
Staying updated with the latest trends in AI and machine learning to influence solution improvement and innovations.
At least 5 years of experience administering Linux HPC environments is desirable. Master’s degree or above in High-Performance Computing, Computer Science, Machine Learning, or a related field, with skills in the following categories:
Deep understanding of common AI frameworks: PyTorch, TensorFlow, JAX, etc.
Deep understanding of common LLM inference frameworks: vLLM, TensorRT, Ollama, MLC, etc.
Deep understanding of new AI use case components: Embedding, RAG, MCP, etc.
Demonstrated experience handling and processing biomedical data, including but not limited to clinical trial data, genomic sequences, proteomic data, or high-dimensional imaging datasets
Configure and integrate various MLOps application components such as model lifecycle management, model serving, hyperparameter tuning, object storage, load balancers, authentication, etc. (e.g. Kubeflow, MLflow, Knative, Katib, MinIO, Istio, Dex, OIDC AuthService)
Understanding of the ML workflow, and how ML pipelines automate the workflow (data preprocessing, model training, model evaluation, hyperparameter tuning, model serving, model registries, etc.)
Build and test ML pipelines
Develop custom container images optimized for ML experimentation
Develop and deploy customized Kubernetes clusters for MLOps services like Kubeflow
Good understanding of database technologies
Expertise in data science IDEs like RStudio or Jupyter Notebook
Extensive experience with Kubernetes and Docker
Industry experience with Amazon Web Services (AWS) services, including IAM, VPC, API Gateway, NLB, ALB, EC2, ECS, EKS, Lambda, S3, RDS, DynamoDB, and SQS
Strong knowledge of Linux systems
Proficiency in Python and Bash scripting
Experience implementing CI/CD/CT pipelines and deployment automation using CI/CD tools and Infrastructure-as-Code (IaC)
Good understanding of networking and related protocols (HTTP, DNS, TLS, TCP)
Understanding of cloud provisioning tools like CloudFormation and Terraform
Expertise with DevOps tools such as GitLab, Terraform, Ansible, and Chef
Exposure to messaging pub/sub systems (e.g., AWS SNS, SQS, RedisQ)
Excellent problem-solving abilities and attention to detail.
Ability to work collaboratively in a team environment and engage with different workstreams.
A focus on solution delivery, with excellent time management and organization skills
Independent, self-driven person with a high sense of accountability.
Must be able to communicate fluently in English and Mandarin, both written and verbal.
A healthier future drives us to innovate. Together, more than 100’000 employees across the globe are dedicated to advancing science, ensuring everyone has access to healthcare today and for generations to come. Our efforts result in more than 26 million people treated with our medicines and over 30 billion tests conducted using our Diagnostics products. We empower each other to explore new possibilities, foster creativity, and keep our ambitions high, so we can deliver life-changing healthcare solutions that make a global impact.
Let’s build a healthier future, together.
Roche is an Equal Opportunity Employer.