DeepMind

Senior Research Engineer for Large Models

Job Description

Posted on: January 30, 2023

We are looking to further grow our large-scale machine learning expertise to accelerate projects across the breadth of DeepMind's research program. We are increasingly focused on problems that require crafting and building research infrastructure at the largest of scales!

Responsibilities

In this role, our Senior Research Engineers work on cutting-edge research problems at the largest scales—including projects like Chinchilla (a compute-optimal language model), Sparrow (a human-aligned dialogue agent that uses search), Flamingo (a visual language model operating on multiple modalities) and more.

You'll be building machine learning infrastructure that makes these models scalable, performant, robust, and reusable—enabling future research breakthroughs by continuously reusing upstream pre-trained model artifacts in novel ways.

In the research domain, we are interested in approaches such as quantisation, sharding regimes, transformer optimisation and others that make large models more performant and less resource intensive. The engineering challenges we solve help us build scalable and reusable tech stacks and APIs for model use across multiple research groups.

You'll work alongside world-class research efforts that are constantly pushing the boundaries in the fields of large models, heterogeneous compute, distributed computation on accelerators, and large model RL training—to name a few.

Your colleagues will be Software and Research Engineers with a diverse set of backgrounds working to accelerate DeepMind's mission and research goals. Our team's solid fundamentals across both engineering and research make us well suited to making large models as broadly usable as possible across DeepMind.

Job Requirements

Candidates should have in-depth knowledge of at least one of the following areas, and will gain familiarity with the rest through on-the-job learning:

  • Training and using large models (>10 billion parameters)
  • Transformer architecture
  • Using HW accelerators (GPU / TPU)
  • Distributed ML system optimisation

Excellent knowledge of either C++ or Python is required.

In addition, the following would be an advantage:

  • Experience implementing, evaluating, and fine-tuning ML algorithms
  • Knowledge of Reinforcement Learning

Apply now
