Overview
The opportunity
We are looking for a performance-driven, focused and committed individual with a proven track record for the role of Engineer-Site Reliability at Presight.
The Company
Presight is an ADX-listed public company with Abu Dhabi-based G42 as its majority shareholder and is the region’s leading big data analytics company powered by GenAI. It combines big data, analytics, and AI expertise to serve every sector, at every scale, to create business and positive societal impact. Presight excels at all-source data interpretation to support insight-driven decision-making that shapes policy and creates safer, healthier, happier, and more sustainable societies. Today, through its range of GenAI-driven products and solutions, Presight is bringing Applied AI to the private and public sectors, enabling them to realize their AI strategy and ambitions faster.
Responsibilities
- Provide operational support for Azure public cloud and on-premises environments, ensuring systems are running efficiently and securely.
- Oversee the deployment and management of Kubernetes clusters, using Helm and ArgoCD for continuous delivery to ensure seamless, rollback-capable deployments.
- Develop and maintain automation scripts using Python and Bash, enhancing operational efficiency and reducing manual intervention.
- Monitor system performance to guarantee maximum uptime, applying necessary patches, hotfixes and upgrades as required.
- Manage and troubleshoot systems in isolated environments, ensuring robust and reliable infrastructure and database management.
- Participate in issue triage, escalation and technical support processes, ensuring timely resolution of incidents and minimizing impact on operations.
- Administer and manage Azure or any other cloud environments, ensuring optimal performance, security, and scalability.
- Design, deploy and manage containerized applications using Docker and Kubernetes.
- Implement and maintain CI/CD pipelines using GitLab and Azure DevOps, enabling seamless code integration and deployment.
- Automate infrastructure provisioning and configuration management (Terraform, Bash/Python scripting, etc.) to enhance system efficiency and reduce manual intervention.
- Collaborate with the Data team to support and manage Azure Data Factory, Databricks, Azure Fabric, and other cloud data services.
- Manage and maintain on-premises environments, including private cloud (OpenStack or any other), bare-metal servers, and Rancher or any other Kubernetes distribution for container orchestration.
- Deploy and administer databases such as MySQL, PostgreSQL, Oracle, and others in isolated environments.
- Ensure adherence to security best practices and compliance requirements across cloud and on-premises environments.
- Monitor system performance and implement proactive measures to enhance security and stability.
- Work closely with cross-functional teams to design, develop, and implement DevOps solutions that align with business objectives.
- Comply with QHSE (Quality Health Safety and Environment), Business Continuity, Information Security, Privacy, Risk, Compliance Management, and Governance of Organizations policies, procedures, plans, and related risk assessments.
Qualifications
- Bachelor’s degree in Computer Science, Information Technology or a related field.
- Eight (8) years of product administration experience in Linux-based environments (RHEL, CentOS, Debian and Ubuntu).
- Must have a minimum of five (5) years of relevant experience as a Hadoop administrator, with expert-level knowledge of cloud operations.
- Proven proficiency in designing, developing and maintaining cloud solutions and isolated environments.
- Strong DevOps mindset with knowledge of the software development life cycle.
- Must have prior experience with big data platforms using automation tools such as Ansible and Helm charts.
- Must have knowledge of best practices for data warehousing, including business intelligence and business continuity planning.
- Hands-on experience with monitoring platforms such as Zabbix and Grafana, and experience working with data delivery teams.
- Experience with the implementation and ongoing administration of applications in air-gapped infrastructure.
- Competency in Linux administration (security, configuration, tuning, troubleshooting and monitoring) and basic network troubleshooting.
- Experience with private, public and hybrid cloud environments.
- Experience working in an Agile environment.
- Relevant certifications in cloud platforms (Azure, AWS), Kubernetes, Docker or DevOps tools are a plus.
What we look for:
If you are performance-driven and inquisitive, with the agility to adapt to ambiguity, you will fit right in. You should be eager to explore opportunities to build meaningful collaborations with stakeholders and aspire to create unique customer-centric solutions. A bias for action and a passion to conquer new frontiers in the AI space are at the heart of the Presight community.
What working at Presight offers:
Culture: An open, diverse and inclusive environment with a global vision that encourages personal growth and focuses on ground-breaking, industry-first innovations.
Career: Outstanding learning, development & growth opportunities via structured training programs and innovative, high-tech projects.
Rewards: A competitive remuneration package with a host of perks including healthcare, education support, leave benefits and more.