In this role, the candidate will need to understand deep learning (DL) workload characteristics and have the hands-on ability to measure and analyze workload data, then use that data to project and estimate the power and performance of the latest DL workloads.

Responsibilities

  • The ideal candidate will have both a software and a hardware background, be able to do sensitivity analysis of hardware knobs, and understand how to measure and improve the performance of DL workloads.
  • The candidate should have worked on simulators and have experience with benchmarking DL models.
  • The ideal candidate should have 5+ years of experience analyzing and improving the performance of DL workloads running on accelerators.
  • Programming and debugging code written in Python/C++/CUDA/HIP/OpenCL will be required, as well as the ability to model, and to work with the hardware teams to measure, the power and performance of key kernels running on RTL and performance simulators.
  • Knowledge of performance and power modeling is a plus.
  • Solid understanding of the fundamentals of computer architecture, memory hierarchy, caches and fabrics is a prerequisite for the role. 

Requirements

  • Excellent problem-solving, written and verbal communication, and organizational skills; highly self-motivated.
  • Ability to work well in a team and be productive under aggressive schedules.

Education and Experience

  • PhD or Master’s degree in Computer Engineering or Computer Science with 5+ years of experience working on DL models.
  • Coursework on computer architecture, parallel computing, compilers and digital design is required.

Location

(US) Santa Clara, CA; Austin, TX; Portland, OR; Fort Collins, CO

Job Overview
Job Posted: 2 days ago
Job Type: Full Time