The Machine Learning Operations Engineer supports our machine learning infrastructure by ensuring seamless model training, optimization, and deployment. This role suits a tech-savvy individual who enjoys managing machine learning systems and hardware configurations rather than focusing solely on programming, although coding experience is a strong plus. The ideal candidate is a computer enthusiast with a knack for machine learning infrastructure and model optimization and a passion for working in a collaborative, fast-paced environment.

Responsibilities

  • Maintain and manage the software configuration of on-premises machine learning hardware to support optimal performance for training neural networks.  
  • Set up and maintain cloud-based training environments, primarily on Google Cloud Platform, to facilitate model experimentation and scalability.  
  • Automate training workflows to drive continuous improvement of vision models, reducing manual overhead and enhancing efficiency.  
  • Develop automated accuracy assessments and generate reports to evaluate and compare the performance of newly trained neural networks against existing models.  
  • Ensure predictable and efficient turnaround times for training models with updated datasets to meet project timelines.  
  • Organize and manage model weights and associated documentation in various formats for deployment across on-premises, cloud, and edge environments.  
  • Apply quantization and pruning techniques to models to enhance computational efficiency without sacrificing accuracy.  
  • Design and deploy infrastructure for low-latency inference to enable real-time performance for large-scale models (e.g., vLLMs).

Requirements

  • Proven experience with Linux server maintenance, including both on-premises and cloud environments.  
  • Proficient in scripting with Bash and Python to streamline system and model management.  
  • Hands-on experience with neural network training, data loaders, and data pre-processing pipelines.  
  • Familiar with data and model parallelism strategies for improving training speed and efficiency.  
  • Knowledgeable in neural network model conversion and optimization for deployment on diverse hardware.

Preferred Qualifications

  • Familiarity with Google Cloud Platform for machine learning operations.  
  • Experience with Nvidia deployment platforms and tooling such as Jetson, Triton Inference Server, and NIM.  
  • Skilled in OpenVINO and ONNX for model conversion and optimization.  
  • Experience training or fine-tuning large language models (LLMs) would be a significant advantage.  
  • Programming experience in Python and C++ is beneficial but not mandatory.  
  • Strong written and verbal communication skills for documentation and collaboration.  
  • Passion for machine learning technology and an aptitude for problem-solving in fast-paced environments.

At Simbe, you will be at the forefront of retail innovation, working with cutting-edge AI and robotics technologies to transform retail operations. Our culture is dynamic, inclusive, and driven by a passion for improving the way retailers operate and serve their customers. Join us to be a part of a team that is not only reshaping the future of retail but also offering immense value to our clients worldwide.

Simbe Values: R. E. T. A. I. L.

  • Result Driven - We are customer-centric and results-driven. We strive to create immense value for our team, partners, customers, and investors.
  • Empathetic - We are sensitive and mindful. We support each other in challenging times, both professionally and personally.
  • Transparent - We highly value open communication internally, and with our partners and customers. We are receptive to feedback.
  • Agile - We are agile and always eager to learn. We quickly adapt to changes and customer needs.
  • Innovative - We are bold and innovative, with an intense focus on product design and user experience.
  • Leaders - We strive for excellence. We are accountable, the best at what we do, and leaders in our field.

Remote Job

Job Overview
Job Posted: 1 week ago
Job Type: Full Time