Bioinformatician | Python | Nextflow | Pangenomes | RNA | Bioinformatics | Hybrid, London
🌽 Safeguarding the future of food
The company is focused on accelerating the development of more productive, sustainable, nutritious, and climate-resilient food sources. To achieve this, it is building a machine learning–driven target discovery platform for crop gene editing.
🧬 Target discovery for gene editing
While gene editing in crops is becoming increasingly efficient, identifying which genes to edit remains a key challenge. The company uses advanced deep learning techniques, including transformers, graph-based models, and causal machine learning, to identify high-value genetic targets for crop improvement.
🤖 Why (pan)genomics matters
Advances in biological machine learning increase the importance of high-quality data curation: omics data must be properly onboarded, quality-controlled, and accessible. In particular, pangenomic data, when processed at scale, opens up new modelling and trait development opportunities.
👥 Team
The company is an early-stage, interdisciplinary team working closely across machine learning, data, and scientific product functions.
The Role
As part of the Computational Biology team, you will lead the development of robust, scalable omics workflows supporting the discovery platform, with an initial focus on plant pangenomes.
You will:
- Review and improve existing pipelines
- Define how pangenomes contribute to discovery projects
- Build computational systems connecting omics data to downstream ML applications
- Work cross-functionally across data, ML, and plant science
Initial Priorities
- Review existing omics pipelines, focusing on pangenomes and RNA-seq
- Define and implement a strategy to improve discovery workflows using pangenomes
Core Responsibilities
- Own pangenome creation and curation
- Maintain and improve omics pipelines and data quality
- Drive innovation across pipelines (pangenomes, gene expression, variant calling)
- Collaborate with data teams on data onboarding and processing
- Support public data QC and ingestion
Additional Responsibilities
- Support internal discovery projects
- Contribute to maintaining high-quality scientific codebases
- Optimise workflow resource usage (e.g., Nextflow pipelines)
Core Competencies
- Extensive experience with plant pangenomes
- Strong experience with Nextflow and omics pipelines
- Proficiency in Python
- Familiarity with common omics formats (FASTQ, VCF, HAL, GFF)
- Experience with Linux and bioinformatics tools (e.g., seqkit, bcftools, samtools)
- Knowledge of public omics repositories (e.g., NCBI, Ensembl, JGI)
- Strong cross-functional communication skills
- Ability to translate scientific workflows into production-grade software
- Experience in scientific or regulated environments requiring reproducibility
Nice to Have
- Familiarity with nf-core standards
- Experience with RNA-seq mapping in pangenomes
- Experience with version control systems (e.g., Git)
Benefits
- Competitive compensation and equity options
- Generous annual leave and flexible working options
- Benefits package
- Career development opportunities
- Ownership of impactful, mission-driven work
- Collaborative and supportive work environment
- Access to conferences and professional development