The Brain Tumor Institute (BTI) Bioinformatics Core at Children’s National Hospital is seeking a highly skilled Senior Bioinformatics Scientist/Engineer to join our team. This position will play a critical role in advancing the research of multiple PIs focused on uncovering oncogenic mechanisms in pediatric brain tumors and identifying novel therapeutic targets. The Senior Bioinformatics Scientist will engage in basic and translational research projects and contribute to tool development, such as interactive applications for visualizing complex genomic data (a brief sketch follows this introduction).
The role involves close collaboration with researchers and clinicians both within Children’s National and with external partners. The successful candidate will report to the Director of the BTI Bioinformatics Core and will lead workflow creation and implementation using CWL and/or NextFlow, benchmark new core pipelines, contribute bioinformatics analyses to focused projects based on PI needs, participate in collaborative activities in the BTI such as code review and/or workshop training, and contribute to grant applications and scientific manuscripts. In addition, this candidate will support core engineering needs such as database/API/UI development and automation.
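For a rough sense of the visualization tooling described above, the sketch below is a minimal Python/Dash app over a toy differential-expression table; the column names, genes, and values are hypothetical placeholders, and the core's actual applications may be built in RShiny or other frameworks.

    # Minimal interactive-visualization sketch (toy data; hypothetical column names).
    import pandas as pd
    import plotly.express as px
    from dash import Dash, dcc, html

    # Hypothetical differential-expression results table.
    df = pd.DataFrame({
        "gene": ["TP53", "MYC", "EGFR"],
        "log2_fold_change": [1.8, -2.3, 0.4],
        "neg_log10_p": [6.2, 4.1, 0.9],
    })

    fig = px.scatter(df, x="log2_fold_change", y="neg_log10_p",
                     hover_name="gene", title="Volcano plot (toy data)")

    app = Dash(__name__)
    app.layout = html.Div([html.H3("Expression browser (sketch)"), dcc.Graph(figure=fig)])

    if __name__ == "__main__":
        app.run(debug=True)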
Key Responsibilities:
● Collaborate with bioinformatics scientists and PIs to benchmark and optimize new production-scale analysis pipelines and workflows that generate outputs with high quality and data integrity.
● Support project-specific engineering needs, such as database/API/UI development.
● Collaborate with IT to ensure AWS IAM and bucket security and optimize resource use.
● Create and maintain clear documentation for data engineering workflows, including codebases, data pipelines, validation, testing, and CI/CD processes.
● Perform high-quality bioinformatics analyses on pediatric oncology datasets, including genomic, transcriptomic, and epigenomic data.
● Design and implement downstream analytical workflows for high-throughput data using GitHub, Docker, and AWS infrastructure, focusing on reproducibility, code efficiency, and scalability (see the sketch after this list).
● Utilize cloud-computing environments (e.g., AWS EC2) and/or high-performance computing (HPC) to support large-scale or memory-intensive analyses.
● Actively and positively participate in sprints and code reviews, ensuring high standards for reproducibility and documentation.
● Engage with multidisciplinary teams, providing bioinformatics expertise to support collaborative research initiatives.
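As one hedged sketch of the containerized, AWS-backed workflow style described in these responsibilities, the Python snippet below runs a single analysis step in a version-pinned Docker image and uploads the result to S3; the image tag, bucket name, and file paths are placeholders rather than the core's actual configuration.

    import subprocess
    import boto3

    IMAGE = "quay.io/biocontainers/samtools:1.19"   # placeholder pinned image tag
    BUCKET = "example-bti-results"                  # placeholder S3 bucket name

    def run_flagstat(bam_path: str, report_path: str) -> None:
        """Run samtools flagstat inside a pinned container, then upload the report."""
        with open(report_path, "w") as report:
            subprocess.run(
                ["docker", "run", "--rm", "-v", "/data:/data", IMAGE,
                 "samtools", "flagstat", bam_path],
                stdout=report, check=True,
            )
        boto3.client("s3").upload_file(report_path, BUCKET, report_path.lstrip("/"))

    if __name__ == "__main__":
        run_flagstat("/data/sample.bam", "/data/sample.flagstat.txt")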
Application Process:
This position will be remote. Candidates should be prepared to share their GitHub handle and present a recent project as part of the interview process.
-------------------------------------------------------------------------------------------------------------------------------------
Build scalable, production-ready machine learning and statistical models to reduce healthcare data latency through automation. This role will focus on advanced statistical and machine learning solutions: collecting, cleansing, and interpreting large volumes of data from varied sources; designing and delivering production-ready models; monitoring and maintaining model health in production; and communicating key findings to stakeholders.
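As a purely illustrative sketch (toy data, not this team's actual stack), the Python snippet below fits a simple production-style pipeline and reports a held-out metric of the kind that would be tracked to monitor model health once deployed.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic stand-in for cleansed features; real inputs would come from ETL.
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = Pipeline([
        ("scale", StandardScaler()),
        ("clf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ])
    model.fit(X_train, y_train)

    # A held-out metric like this would be logged and monitored in production.
    print("test AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))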
Preferred Skills:
● Ph.D. in Bioinformatics, Computational Biology, or a related field, or equivalent industry experience.
● At least ten years of experience in bioinformatics, including cancer research, with expertise in Bash, R or Python, and RShiny and/or Python GUI applications.
● Proficiency with cloud-based or high-performance computing environments for bioinformatics workflows.
● Strong experience with tools and best practices for reproducibility, including Git and Docker.
● Proven experience with genomic data types such as single nucleotide variants (SNVs), copy number variants, fusions, RNA expression, methylation, proteomics, splicing, and single cell datasets.
● Commitment to open science practices, including sharing and collaborating on code, data, and documentation.
● Extensive experience with current standard parallel computing and data processing workflows (e.g., Snakemake, NextFlow, CWL, WDL).
● Experience diagnosing and troubleshooting pipeline errors and unexpected behaviors, including taking the initiative to debug, search online, contact software authors, and otherwise seek assistance as needed.
● Experience with reproducible pipeline development, including software version control, use and creation of Docker and/or Singularity images, and collaborative code review.
● Demonstrated ability to develop and implement best practices for bioinformatics systems integration, testing, and deployment (required).
● Interest in learning AWS cloud architecture, design, and automation.
● Strong organizational and project management skills, with the ability to work on multiple projects and teams.
● Excellent communication skills, with the ability to work in cross-disciplinary teams.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Minimum Education
Bachelor’s degree in a quantitative/statistical or business field (e.g., Statistics, Mathematics, Engineering, Computer Science). (Required)
Master’s degree. (Preferred)
Minimum Work Experience
6+ years of related experience requiring deep functional knowledge, or equivalent experience acquired through accomplishments reflecting the knowledge, duties, scope, and skill of this position. (Required)
Required Skills/Knowledge
Experience working in a heavily regulated industry. Healthcare is a plus.
Advanced coursework in machine learning and programming.
Experience working with globally distributed, multicultural teams.
Experience with agile leadership.
Experience with building, delivering, and maintaining production-ready machine learning models.
Knowledge of statistical data analysis and machine learning, such as linear models, time-series forecasting, neural networks, random forests, and NLP models.
Expertise in Python and in using machine learning and statistical packages for modeling.
Database skills, including SQL, NoSQL, and coding for ETL.
In-depth understanding of machine learning algorithms such as random forests, neural networks, graph models, and NLP.
Familiarity with Spark, Azure, Databricks, MLflow, and AutoML (see the sketch after this list).
Experience and familiarity with backlog management tools and resources, ideally with JIRA and Confluence.
Seeks to acquire knowledge in area of specialty.
Ability to identify basic problems and procedural irregularities, collect data, establish facts, and draw valid conclusions.
Ability to work independently.
Demonstrated analytical skills.
Demonstrated project management skills.
Demonstrates a high level of accuracy, even under pressure.
Demonstrates excellent judgment and decision-making skills.
Ability to communicate and make recommendations to leadership.
Ability to drive multiple projects to successful completion.
Possesses technical aptitude.
Excellent verbal and written communication skills; able to communicate complex findings in a clear and understandable manner.
Excellent facilitation skills; able to host sessions, elicit ideas from others, understand their issues, and encourage group participation.
Attention to detail.
Collaborate effectively with cross-functional teams.
Adapt to changing priorities and thrive in a dynamic environment.
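As a brief, hedged example of the experiment-tracking tooling named in the skills list above, the Python snippet below logs parameters and a metric to MLflow; the experiment name and values are placeholders, not results from any actual project.

    import mlflow

    mlflow.set_experiment("demo-model-monitoring")  # placeholder experiment name

    with mlflow.start_run():
        # Log hyperparameters and a held-out metric so runs remain comparable over time.
        mlflow.log_param("model_type", "random_forest")
        mlflow.log_param("n_estimators", 200)
        mlflow.log_metric("test_auc", 0.91)  # toy value, not a real result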
Functional Accountabilities
Organizational Accountabilities (Staff): Organizational Commitment/Identification; Teamwork/Communication; Performance Improvement/Problem-solving; Cost Management/Financial Responsibility; Safety
Pay: Yearly based
Location: Washington, District of Columbia