Blend is a premier AI services provider, committed to co-creating meaningful impact for its clients through the power of data science, AI, technology, and people. With a mission to fuel bold visions, Blend tackles significant challenges by seamlessly aligning human expertise with artificial intelligence. The company is dedicated to unlocking value and fostering innovation for its clients by harnessing world-class people and data-driven strategy. We believe that the power of people and AI can have a meaningful impact on your world, creating more fulfilling work and projects for our people and clients. For more information, visit www.blend360.com
We are seeking a highly skilled ML/MLOps Manager with 8 years of overall experience, including 3 years as an ML Engineer, particularly in building and managing ML pipelines using MLflow or CML (Cloudera Machine Learning). The ideal candidate has successfully built and deployed at least two MLOps projects using MLflow or similar services, with a strong foundation in infrastructure as code and a keen understanding of MLOps best practices.
Key Responsibilities
Maintain and enhance existing ML pipelines in on-premises environments, with a focus on infrastructure as code.
Implement minimal but essential pipeline extensions to support ongoing data science workstreams.
Convert data science notebooks into production-ready, deployable components.
Build ML pipelines for training, inference, and monitoring.
Document infrastructure usage, architecture, and design using tools like Confluence, GitHub Wikis, and system diagrams.
Act as the internal infrastructure expert, collaborating with data scientists to guide and support ML model deployments.
Research and implement optimization strategies for ML workflows and infrastructure.
Work independently and collaboratively with cross-functional teams to support ML product
Lead the design, development, and management of robust ML pipelines and infrastructure in on-premises or private cloud environments.
Define and drive MLOps strategy and best practices for model deployment, monitoring, and lifecycle management.
Oversee the implementation and governance of Infrastructure as Code (IaC) using tools like Ansible, Terraform (for private cloud), or Puppet.
Manage, mentor, and guide MLOps engineers, fostering a high-performing and collaborative team.
Collaborate with cross-functional teams to align MLOps solutions with business and data science objectives.
Drive automation and standardization of CI/CD pipelines, model versioning, and container orchestration (e.g., Docker, Kubernetes, OpenShift).
Ensure comprehensive documentation of infrastructure, architecture, and operational workflows using tools like Confluence, GitHub Wikis, and system diagrams.
Identify and implement optimization opportunities for ML infrastructure performance, cost, and scalability.
Stay updated on industry trends and emerging technologies to continuously enhance MLOps capabilities.
Qualifications
8+ years of hands-on MLOps experience with GitHub Actions, Jenkins, or equivalent tools.
Strong knowledge of ML workflows and MLOps concepts such as model governance, model monitoring, data drift, and retraining.
Proven experience with at least two MLOps projects deployed using MLflow or CML.
Strong proficiency in tools such as Bitbucket, Git, GitHub Actions, Jenkins, Airflow, and CML.
Expertise in Infrastructure as Code using CloudFormation for dev/test/prod environments.
Solid understanding of MLOps best practices and Data Science principles.
Proficient in Python for scripting and automation.
Experience converting data science notebooks into production-ready, deployable components.
Proven experience in building ML pipelines for training, inference, and monitoring.
Experience building and managing Docker images.
Hands-on experience with Git-based version control systems such as Bitbucket and GitHub, including GitHub Actions, Jenkins, or similar tools for CI/CD pipelines, is a plus.