AI Engineer / Software Developer (GEMINI Project)
The GEMINI Project
The GEMINI project, "Digital Twins by Generative Artificial Intelligence to Boost Personalized Medicine and Therapeutic Innovation", led by Humanitas Research Hospital, Humanitas AI Center, and Humanitas University, has been funded by the Ministry of University and Research under the FISA program.
Over the next five years, the goal is to develop a model (a Digital Twin) based on innovative, scalable AI algorithms and real-world data that replicates the biological and clinical complexity of human diseases with high social impact (such as cancers, debilitating chronic diseases, and rare diseases).
The ultimate aim is to provide physicians and patients with a validated and transparent tool to support critical aspects of clinical management, including diagnosis, prognosis, and optimal treatment selection.
Job Description
We are looking for an AI Engineer / Software Developer with expertise in software engineering, machine learning, and deep learning applied to structured and unstructured medical data.
In this role, you will focus on developing AI-driven models to answer clinical questions with evidence-based insights in an R&D environment, leveraging data from electronic health records (EHRs). You will collaborate closely with clinical stakeholders to assess key performance indicators (KPIs), refine modeling strategies, and adjust hypotheses as needed. Additionally, you will contribute to the development of shared libraries for your team and integrate existing ones into the workflow.
You will be responsible for designing, developing, and deploying AI solutions, optimizing existing pipelines, and contributing to innovative research in data synthesis and digital twins using cutting-edge machine learning and deep learning methodologies.
Responsibilities and Main Activities
- Research and develop generative models for synthetic data generation and digital twins in healthcare.
- Design, develop, and deploy medical software solutions, code repositories, and end-to-end pipelines using appropriate MLOps tools.
- Manage and maintain the full software development life cycle.
- Develop data engineering pipelines.
- Implement and maintain machine learning pipelines in production environments.
- Define and support the clinical validation of statistical and ML models applied to real-world healthcare data.
- Perform exploratory data analysis (EDA) and integrate highly fragmented medical datasets.
- Develop innovative algorithms and conduct data analysis to uncover significant patterns and trends in complex datasets.
- Collaborate with international partners in industry and academia on cutting-edge AI and ML research projects.
Skills and Qualifications
- Strong knowledge of machine learning, deep learning, and statistical techniques.
- Understanding of data structures, data modeling, and software architecture.
- Proven experience in developing ML/DL solutions (classification, regression, clustering, generative models) within the healthcare domain.
- Expertise in designing and building structured software and pipelines for data scientists and technical users.
- Proficiency in Python and R.
- Experience with ML/DL frameworks such as TensorFlow, PyTorch, and Keras.
- Knowledge of CI/CD pipelines, Git versioning, containerization (Docker), and MLOps best practices.
- Experience with cloud platforms (GCP, AWS, Azure) and distributed computing.
- Knowledge of databases and data lakes; SQL proficiency is a plus.
- Ability to design and develop backend API services and simple web applications for data science use cases.
- Experience in Computer Vision (CV) and Natural Language Processing (NLP).
- Familiarity with Generative AI and Large Language Models (LLMs) is a plus.
- Master’s degree (PhD preferred) in a STEM discipline.
- Fluent in English and Italian (written and spoken).
- Strong software development skills, including knowledge of software design patterns, object-oriented programming (OOP), and microservices architecture.
- Proficiency in software engineering principles, including testing, debugging, and version control.
- Experience with data engineering, including ETL pipelines, data warehousing, and real-time data processing.
- Knowledge of big data technologies such as Apache Spark, Kafka, and Hadoop is a plus.
- Familiarity with RESTful APIs, GraphQL, and web development frameworks.
Soft Skills
- Excellent teamwork skills, including with colleagues from different research areas and backgrounds;
- Strong self-motivation, commitment, and a proactive approach;
- Ability to meet deadlines and work autonomously in rapidly changing environments;
- Curiosity and willingness to step outside your comfort zone.
Contract and duration
We offer a fixed-term contract. Contract duration, salary, and employment level will be defined based on the candidate's profile.
All candidate data collected during the application process will be processed in accordance with applicable law: Legislative Decrees (D.Lgs.) 198/2006, 215/2003, and 216/2003, and Articles 13 and 14 of EU Regulation 2016/679 (GDPR).