Member of Technical Staff, GPU Performance Engineering

at Inflection

Full Time

Inflection AI is a public benefit corporation leveraging our world-class large language model to build the first AI platform focused on the needs of the enterprise.

Who we are:

Inflection AI was re-founded in March 2024, and our leadership team has assembled a team of kind, innovative, and collaborative individuals focused on building enterprise AI solutions. We are passionate about what we are building, enjoy working together, and strive to hire people with diverse backgrounds and experience.

Our first product, Pi, is an empathetic, conversational chatbot. Pi is a public instance built on our 350B+ frontier model with our sophisticated fine-tuning (10M+ examples), inference, and orchestration platform. We are now focusing on building new systems that directly support the needs of enterprise customers using this same approach.

Want to work with us? Have questions? Learn more below.

About the Role

As a Member of Technical Staff on our GPU Performance Engineering team, you will play a critical role in optimizing the overall performance and efficiency of our AI systems. Instead of a narrow focus on compilers or GPU kernels, you'll work across the higher levels of the stack—bridging system-level optimizations with application-level orchestration—to ensure our platforms consistently deliver enterprise-grade speed, reliability, and scalability.

This is a good role for you if you:

Have experience in performance optimization across multiple layers of the technology stack, from system-level programming to high-level application integration.
Are proficient in languages like C/C++ and comfortable working in higher-level environments with modern frameworks.
Are familiar with GPU programming (e.g., CUDA) and acceleration techniques, and understand the broader aspects of system performance.
Thrive in dynamic, fast-paced environments where pushing technology forward is part of your DNA.
Enjoy collaborating with ML researchers and engineers to identify bottlenecks and implement holistic solutions that drive innovation across our AI platform.

Responsibilities include:

Designing and implementing performance enhancements that span the entire AI stack, from core systems to orchestration layers.
Collaborating closely with cross-functional teams to identify performance bottlenecks and develop strategic solutions to maximize hardware utilization and efficiency.
Integrating emerging technologies and best practices to continuously improve the performance and scalability of our production systems.
Leading initiatives to streamline and optimize the deployment pipelines that support our AI solutions in enterprise environments.
Serving as a key technical leader, influencing the evolution of our technology stack and ensuring our systems remain at the forefront of performance innovation.

Employee Pay Disclosures

At Inflection AI, we aim to attract and retain the best employees and compensate them in a way that appropriately and fairly values their individual contributions to the company. For this role, Inflection AI estimates that the starting annual base salary will fall in the range of approximately $175,000 - $350,000, depending on experience. This estimate can vary based on the factors described above, so the actual starting annual base salary may be above or below this range.

Benefits

Inflection AI values and supports our team’s mental and physical health. We are focused on building a positive, safe, inclusive and inspiring place to work. Our benefits include:

Diverse medical, dental and vision options
401k matching program
Unlimited paid time off
Parental leave and flexibility for all parents and caregivers
Support of country-specific visa needs for international employees living in the Bay Area

Interview Process

Apply: Please apply on LinkedIn or our website for a specific role.

After speaking with one of our recruiters, you’ll enter our structured interview process, which includes the following stages:

Hiring Manager Conversation – An initial discussion with the hiring manager to assess fit and alignment.
Technical Interview – A deep dive with an Inflection Engineer to evaluate your technical expertise.
Onsite Interview – A comprehensive assessment, including:

A domain-specific interview
A system design interview
A final conversation with the hiring manager

Depending on the role, we may also ask you to complete a take-home exercise or deliver a presentation.

For non-technical roles, be prepared for a role-specific interview, such as a portfolio review.

Decision Timeline
We aim to provide feedback within one week of your final interview.

Salary

$175,000 - $350,000

Yearly based

Location

Palo Alto, CA

Engineer

Job Overview

Job Posted:

3 months ago
