We are looking for people with strong ML and distributed systems backgrounds. This role sits within our Research team, collaborating closely with researchers to build the platforms for training our next generation of foundation models.

Responsibilities

  • Work with researchers to scale up the systems required for our next generation of models, trained on multi-thousand-GPU clusters.
  • Profile and optimize our model training codebase to achieve best-in-class hardware efficiency.
  • Build systems to distribute work across massive GPU clusters efficiently.
  • Design and implement methods to robustly train models in the presence of hardware failures.
  • Build tooling to help us better understand problems in our largest training jobs.

Experience

  • 5+ years of work experience.
  • Experience working with multi-modal ML pipelines, high-performance computing, and/or low-level systems.
  • Passion for diving deep into systems implementations and understanding their fundamentals in order to improve performance and maintainability.
  • Experience building stable and highly efficient distributed systems.
  • Strong generalist Python and software engineering skills, including significant experience with PyTorch.
  • Experience working with high-performance C++ or CUDA is a plus.
  • Please note that this role is not intended for recent graduates.

Compensation

  • The pay range for this position in California is $180,000 - $250,000 per year; however, base pay offered may vary depending on job-related knowledge, skills, candidate location, and experience. We also offer competitive equity packages in the form of stock options and a comprehensive benefits plan.
Your application is reviewed by real people.

Location

Palo Alto, California

Job Type

Full Time