Job Description Summary

Designs, develops, tests, debugs, and implements more complex operating system components, software tools, and utilities with full competency. Coordinates with users to determine requirements. Reviews systems under development and related documentation. Makes more complex modifications to existing software to fit specialized needs and configurations, and maintains program libraries and technical documentation. May coordinate activities of the project team and assist in monitoring project schedules and costs.

Responsibilities 

  • Design and implement scalable data pipelines for both ML and non-ML applications 

  • Build and maintain data lakes and feature stores, preferably optimized for machine learning

  • Develop ETL processes for complex, high-volume datasets 

  • Create and maintain infrastructure for ML model training and deployment 

  • Collaborate with data scientists to productionize ML models 

  • Implement CI/CD pipelines for ML models  

  • Optimize data processing for model training and inference 

  • Monitor data system performance and troubleshoot issues

  • Ensure data quality, integrity, and governance  

  • Design real-time data processing solutions for ML applications and other consumer applications 

Requirements 

  • Bachelor's or master's degree in Computer Science, Engineering, or a related technical field

  • Minimum of 5 years' experience building data pipelines for both structured and unstructured data

  • At least 2 years' experience in Azure data pipeline development

  • Preferably 3 or more years' experience with Hadoop, Azure Databricks, Azure Stream Analytics, Azure Event Hubs, Kafka, and Flink

  • Strong proficiency in Python and SQL 

  • Experience with big data technologies (Spark, Hadoop, Kafka) 

  • Familiarity with ML frameworks (TensorFlow, PyTorch, scikit-learn) 

  • Knowledge of model serving technologies (TensorFlow Serving, MLflow, Kubeflow) is a plus

  • Experience with one of the cloud platforms (Azure preferred) and its data services; an understanding of its ML services is preferred

  • Understanding of containerization and orchestration (Docker, Kubernetes) 

  • Experience with data versioning and ML experiment tracking is a plus

  • Knowledge of distributed computing principles 

  • Familiarity with DevOps practices and CI/CD pipelines 
     

Preferred Qualifications

  • Bachelor’s degree in Computer Science or equivalent practical experience.  

  • Experience with Agile/Scrum methodologies.  

  • Background in tax and accounting domains is advantageous 

  • Azure Data Engineer certification is beneficial. 

Location

IND-Pune-IndiQube Orchid, India

Job Overview

Job Posted: 1 week ago
Job Type: Full Time
