Job Summary:

We are seeking a highly experienced and visionary Databricks Data Architect with over 14 years of experience in data engineering and architecture, including deep hands-on work designing and scaling Lakehouse architectures on Databricks. The ideal candidate will bring expertise across data modeling, data governance, real-time and batch processing, and cloud-native analytics. You will lead the strategy, design, and implementation of modern data architecture to drive enterprise-wide data initiatives and maximize the value of the Databricks platform.

 

Key Responsibilities:

  • Lead the architecture, design, and implementation of scalable and secure Lakehouse solutions using Databricks and Delta Lake.
  • Define and implement data modeling best practices, including medallion architecture (bronze/silver/gold layers).
  • Champion data quality and governance frameworks leveraging Databricks Unity Catalog for metadata, lineage, access control, and auditing.
  • Architect real-time and batch data ingestion pipelines using Apache Spark Structured Streaming, Auto Loader, and Delta Live Tables (DLT).
  • Develop reusable templates, workflows, and libraries for data ingestion, transformation, and consumption across various domains.
  • Collaborate with enterprise data governance and security teams to ensure compliance with regulatory and organizational data standards.
  • Promote self-service analytics and data democratization by enabling business users through Databricks SQL and Power BI/Tableau integrations.
  • Partner with Data Scientists and ML Engineers to enable ML workflows using MLflow, Feature Store, and Databricks Model Serving.
  • Provide architectural leadership for enterprise data platforms, including performance optimization, cost governance, and CI/CD automation in Databricks.
  • Define and drive the adoption of DevOps/MLOps best practices on Databricks using Databricks Repos, Git, Jobs, and Terraform.
  • Mentor and lead engineering teams on modern data platform practices, Spark performance tuning, and efficient Delta Lake optimizations (Z-ordering, OPTIMIZE, VACUUM, etc.); a brief illustrative sketch follows this list.
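
As an illustration of the patterns referenced above (Auto Loader ingestion and routine Delta Lake maintenance), a minimal PySpark sketch follows. The catalog, table, storage paths, and Z-order column (main.sales.bronze_orders, /Volumes/main/sales/..., order_date) are hypothetical placeholders, not part of this posting.

    # Minimal sketch; the table name, paths, and Z-order column are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()  # already provided in Databricks notebooks

    # Incremental bronze ingestion with Auto Loader (the cloudFiles source)
    bronze_query = (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/Volumes/main/sales/_schemas/orders")
        .load("/Volumes/main/sales/landing/orders/")
        .writeStream
        .option("checkpointLocation", "/Volumes/main/sales/_checkpoints/bronze_orders")
        .trigger(availableNow=True)  # process all available files, then stop
        .toTable("main.sales.bronze_orders")
    )
    bronze_query.awaitTermination()

    # Routine Delta Lake maintenance: compact small files, co-locate rows on a
    # frequently filtered column, then clean up files no longer referenced.
    spark.sql("OPTIMIZE main.sales.bronze_orders ZORDER BY (order_date)")
    spark.sql("VACUUM main.sales.bronze_orders RETAIN 168 HOURS")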

 

Technical Skills:

  • 10+ years of experience in Data Warehousing, Data Architecture, and Enterprise ETL design.
  • 5+ years of hands-on experience with Databricks on Azure/AWS/GCP, including advanced Apache Spark and Delta Lake.
  • Strong command of SQL, PySpark, and Spark SQL for large-scale data transformation.
  • Proficiency with Databricks Unity Catalog, Delta Live Tables, Auto Loader, DBFS, Jobs, and Workflows.
  • Hands-on experience with Databricks SQL and integration with BI tools (Power BI, Tableau, etc.).
  • Experience implementing CI/CD on Databricks using tools such as Git, Azure DevOps, Terraform, and Databricks Repos.
  • Proficiency with streaming architectures using Spark Structured Streaming with Kafka, Event Hubs, or Kinesis.
  • Understanding of ML lifecycle management with MLflow and experience deploying MLOps solutions on Databricks; a minimal tracking example follows this list.
  • Familiarity with cloud object stores (e.g., AWS S3, Azure Data Lake Storage Gen2) and data lake architectures.
  • Exposure to data cataloging and metadata management using Unity Catalog or third-party tools.
  • Knowledge of orchestration tools like Airflow, Databricks Workflows, or Azure Data Factory.
  • Experience with Docker/Kubernetes for containerization (optional, for cross-platform knowledge).
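
For the MLflow item above, a minimal tracking example of the kind this role would help standardize. The dataset, model, hyperparameter, and run name are illustrative only, and the sketch assumes scikit-learn is available alongside MLflow.

    # Minimal MLflow tracking sketch; data, model, and run name are illustrative only.
    import mlflow
    import mlflow.sklearn
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=500, n_features=10, random_state=42)

    with mlflow.start_run(run_name="baseline"):
        model = LogisticRegression(max_iter=200).fit(X, y)
        mlflow.log_param("max_iter", 200)                        # hyperparameter
        mlflow.log_metric("train_accuracy", model.score(X, y))   # simple training metric
        mlflow.sklearn.log_model(model, "model")                 # model artifact for registry/serving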

 

Preferred Certifications:

  • Databricks Certified Data Engineer Associate/Professional
  • Databricks Certified Lakehouse Architect
  • Microsoft Certified: Azure Data Engineer / Azure Solutions Architect
  • AWS Certified Data Analytics – Specialty
  • Google Professional Data Engineer

Location

Kochi, IN

Job Type
Full Time
