Cerebras is on a mission to accelerate the pace of progress in Generative AI by building AI supercomputers that deliver unprecedented performance for LLM training. Cerebras leverages these supercomputers to turbocharge the exploration of end-to-end solutions that address real-world challenges, such as breaking down language barriers, enhancing developer productivity, and accelerating medical research. The AppliedML team at Cerebras is a team of Generative AI practitioners and experts who use Cerebras AI supercomputers to push the technical frontiers of the domain and work with our partners to build compelling solutions. This team's publicly announced successes include BTLM, the Jais 30B multilingual model, and an Arabic chatbot, among others.

About the role 

As an applied machine learning engineer, you will adapt state-of-the-art deep learning (DL) models to run on our wafer-scale system. This includes both functional validation and performance tuning of a variety of core models for applications such as Natural Language Processing (NLP), Large Language Models (LLMs), Computer Vision (CV), and Graph Neural Networks (GNNs).

As a member of the Cerebras engineering team, you will implement models in popular DL frameworks like PyTorch and use insights into our hardware architecture to unlock the full potential of our chip. You will work on all aspects of the DL model pipeline, including:

  • Dataloader implementation and performance optimization
  • Reference model implementation and functional validation
  • Model convergence and hyper-parameter tuning
  • Model customization to meet customer needs
  • Model architecture pathfinding

This role will allow you to work closely with partner companies at the forefront of their fields across many industries. You will see how deep learning is being applied to some of the world's most difficult problems today, and help ML researchers in these fields innovate more rapidly and in ways that are not currently possible on other hardware systems.

Responsibilities

  • Analyze, implement, and optimize DL models for the WSE
  • Validate functional correctness and convergence of models on the WSE
  • Work with engineering teams to optimize models for the Cerebras stack
  • Support engineering teams in functional and performance scoping of new models and layers
  • Work with customers to optimize their models for the Cerebras stack
  • Develop new approaches for solving real-world AI problems across various domains

Requirements

  • Master's degree or PhD in engineering, science, or a related field, with 5+ years of experience
  • Experience programming in a modern language such as Python or C++
  • In-depth understanding of DL methods and model architectures
  • Experience with DL frameworks such as PyTorch, TensorFlow, or JAX
  • Familiarity with state-of-the-art transformer architectures for language and vision models
  • Experience with model training and hyper-parameter tuning techniques
  • Familiarity with common LLM downstream tasks and datasets

Preferred Skills

  • A deep passion for cutting-edge artificial intelligence techniques
  • Understanding of hardware architecture
  • Experience programming accelerators such as GPUs and FPGAs

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.

Location

Bengaluru, Karnataka, India

Job Overview

Job Type: Full Time