NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology—and amazing people.
Today, we’re tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.
Join NVIDIA, where we are pushing the boundaries of what's possible in AI and cloud computing. As a System Software Engineer - AI and Cloud, you will be part of a team of dedicated professionals that thrives on innovation and collaboration. Based in the heart of Silicon Valley, you will have the opportunity to work on groundbreaking projects that shape the future of technology. This role offers an outstanding chance to engage with advanced AI models and cloud-native architectures, making significant contributions to NVIDIA’s products and technologies.
What you'll be doing:
Evaluate cloud-native, full-stack applications using microservices architecture to power AI use cases, bringing to bear NVIDIA frameworks, SDKs, and microservices.
Design and implement agentic workflows with advanced techniques like Retrieval-Augmented Generation (RAG) and the latest AI models.
Evaluate user experiences and analyze the technical performance of AI solutions, compiling findings into comprehensive reports. Offer practical suggestions for product improvement to senior executives and engineering management.
Engage with various teams across NVIDIA such as product, marketing, hardware, software engineering, and QA to improve NVIDIA's product offerings.
Create developer-focused content, including detailed tutorials and code samples, to demonstrate the latest features in NVIDIA’s tools and libraries.
Write technical whitepapers and product briefs, and run technical demos of our products at prominent industry conferences.
What we need to see:
A Bachelor’s or Master’s in Software Engineering, Computer Science, Computer Engineering, Electrical Engineering or a related degree (or equivalent experience)
3+ years of experience.
Proficiency in Python and JavaScript for programming and debugging, with a strong foundation in data structures, algorithms, and software design principles.
Basic familiarity with C++ programming and its application in high-performance computing environments.
Experience in crafting cloud-native systems optimized for Kubernetes deployment, using inference frameworks such as vLLM and NVIDIA Triton Inference Server.
A solid understanding of API design principles for building scalable, production-ready inference systems.
Ways to stand out from the crowd:
Advanced knowledge of LLMs, modern AI software architecture, and cloud APIs.
Contributions to public-facing technical content and open-source projects.
Expertise in deploying LLM inference frameworks such as Triton Inference Server, vLLM, or TensorRT, including deployments on Kubernetes or edge devices, to improve performance.
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
US, CA, Santa Clara