We are looking for a Senior Machine Learning Engineer to join our new Knowledge Enrichment team at BenchSci. You will help design and implement ML-based approaches to analyse, extract and generate knowledge from complex biomedical data, such as experimental protocols and results, drawn from several heterogeneous sources, including both publicly available data and proprietary internal data, represented in unstructured text and knowledge graphs. The data will be leveraged to enrich BenchSci’s knowledge graph through classification, discovery of high-value implicit relationships, prediction of novel insights/hypotheses, and other ML techniques. You will collaborate with your team members in applying state-of-the-art ML and graph ML/data science algorithms to this data. You are comfortable working in a team that pushes the boundaries of what is possible with cutting-edge ML/AI, challenges the status quo, and is laser-focused on value delivery in a fail-fast environment.
You Will:
Analyse and manipulate a large, highly-connected biological knowledge graph constructed of data from multiple heterogeneous sources, in order to identify data enrichment opportunities and strategies
Work with data and knowledge engineering experts to design and develop knowledge enrichment approaches/strategies that can exploit data within our knowledge graph
Provide solutions related to classification, clustering, "more like this" querying, discovery of high-value implicit relationships, and making inferences across the data that can reveal novel insights
Deliver robust, scalable and production-ready ML models, with a focus on optimising performance and efficiency
Architect and design ML solutions, from data collection and preparation, model selection, training, fine-tuning and evaluation, to deployment and monitoring
Collaborate with your teammates from other functions such as product management, project management and science, as well as other engineering disciplines
Provide technical leadership, as needed, on Knowledge Enrichment projects that seek to use ML to enrich the data in BenchSci’s Knowledge Graph
Work closely with other ML engineers to ensure alignment on technical solutioning and approaches
Liaise closely with stakeholders from other functions including product and science
Help ensure adoption of ML best practices and state-of-the-art ML approaches at BenchSci
Participate in and sometimes lead various agile rituals and related practices
You Have:
Minimum 5, ideally 8+ years of experience working as an ML engineer in industry
Technical leadership experience, including leading 5-10 ICs on complex projects in industry
Degree, preferably PhD, in Software Engineering, Computer Science, or a similar area
A proven track record of delivering complex ML projects, working alongside high-performing ML engineers using agile software development
Demonstrable ML proficiency with a deep understanding of how to utilise state-of-the-art NLP and ML techniques
Mastery of several ML frameworks and libraries, with the ability to architect complex ML systems from scratch. Extensive experience with Python and PyTorch
Track record of successfully delivering robust, scalable and production-ready ML models, with a focus on optimising performance and efficiency
Experience with the full ML development lifecycle from architecture and technical design, through data collection and preparation, model selection, training, fine-tuning and evaluation, to deployment and maintenance
Strong skills related to implementing solutions leveraging Large Language Models, as well as a deep understanding of how to implement solutions using Retrieval Augmented Generation (RAG) architecture
Expertise in graph machine learning (e.g. graph neural networks, graph data science) and practical applications thereof. This is complemented by your experience working with Knowledge Graphs, ideally biological, and a familiarity with biological ontologies
Experience with complex problem solving and an eye for details such as scalability and performance of a potential solution
Experience with data manipulation and processing, such as SQL, Cypher or Pandas
A growth mindset: you continuously seek to stay up-to-date with cutting-edge advances in ML/AI, complemented by active engagement with the ML/AI community