In this role, you will design highly scalable, high-performing technology solutions in an Agile work environment, and produce and deliver code and/or test cases using your knowledge of software development and Agile practice. You will collaborate closely with business support teams, product managers, security, and architecture to help resolve critical production issues and to simplify and improve business processes through the latest technology and automation. You are a technical expert who will lead the requirements gathering, design, development, deployment, and support phases of a product. You are proficient in at least one core programming language or package.
What You'll Do
Serve as a Senior Data Engineer with expertise in designing and implementing scalable data solutions, including robust data pipelines.
Strong proficiency in ETL processes, MLOps practices for efficient model deployment, and technologies such as Databricks, Data Lake, Vector DB, and Feature Store is essential.
Design, optimize, and maintain scalable data pipelines using PySpark (Apache Spark), Python, Databricks, and Delta Lake.
Implement MLOps practices for efficient deployment and monitoring of machine learning models.
Develop strategies and tools for detecting and mitigating data drift.
Utilize Vector DB for effective data querying and management.
Establish and manage a Feature Store to centralize and share feature data for machine learning models.
Ensure data integrity and quality throughout all stages of the pipeline.
Collaborate with teams and stakeholders to deliver impactful data solutions.
Demonstrate proficiency in Python programming, PySpark (Apache Spark), data architecture, ETL processes, and cloud platforms (AWS, Azure, GCP).
Who You Are
5+ years of overall experience with Databricks, Delta Lake, PySpark (Apache Spark), MLOps, data drift detection, Vector DB, and Feature Store.
Experience designing, optimizing, and maintaining data pipelines.
Experience implementing MLOps practices for efficient deployment and monitoring of ML models.
Ability to develop strategies and tools for detecting and mitigating data drift.
Experience using Vector DB for effective data querying and management.