Bioinformatician | Python | Nextflow | Pangenomes | RNA | Bioinformatics | Hybrid, London

🌽 Safeguarding the future of food

The company is focused on accelerating the development of more productive, sustainable, nutritious, and climate-resilient food sources. To achieve this, it is building a machine learning–driven target discovery platform for crop gene editing.

🧬 Target discovery for gene editing

While gene editing in crops is becoming increasingly efficient, identifying which genes to edit remains a key challenge. The company uses advanced deep learning techniques—including transformers, graph-based models, and causal machine learning—to identify high-value genetic targets for crop improvement.

🤖 Why (pan)genomics matters

Advances in biological machine learning make careful data curation more important: omics data must be properly onboarded, quality-controlled, and accessible. In particular, pangenomes processed at scale open up new modelling and trait development opportunities.

👥 Team

The company is an early-stage, interdisciplinary team working closely across machine learning, data, and scientific product functions.

The Role

As part of the Computational Biology team, you will lead the development of robust, scalable omics workflows supporting the discovery platform, with an initial focus on plant pangenomes.

You will:

Review and improve existing pipelines
Define how pangenomes contribute to discovery projects
Build computational systems connecting omics data to downstream ML applications
Work cross-functionally across data, ML, and plant science

Initial Priorities

Review existing omics pipelines, focusing on pangenomes and RNA-seq
Define and implement a strategy to improve discovery workflows using pangenomes

Core Responsibilities

Own pangenome creation and curation
Maintain and improve omics pipelines and data quality
Drive innovation across pipelines (pangenomes, gene expression, variant calling)
Collaborate with data teams on data onboarding and processing
Support public data QC and ingestion

Additional Responsibilities

Support internal discovery projects
Contribute to maintaining high-quality scientific codebases
Optimise workflow resource usage (e.g., Nextflow pipelines)

Core Competencies

Extensive experience with plant pangenomes
Strong experience with Nextflow and omics pipelines
Proficiency in Python
Familiarity with common omics formats (FASTQ, VCF, HAL, GFF)
Experience with Linux and bioinformatics tools (e.g., seqkit, bcftools, samtools)
Knowledge of public omics repositories (e.g., NCBI, Ensembl, JGI)
Strong cross-functional communication skills
Ability to translate scientific workflows into production-grade software
Experience in scientific or regulated environments requiring reproducibility

Nice to Have

Familiarity with nf-core standards
Experience with RNA-seq mapping in pangenomes
Experience with version control systems (e.g., Git)

Benefits

Competitive compensation and equity options
Generous annual leave and flexible working options
Benefits package
Career development opportunities
Ownership of impactful, mission-driven work
Collaborative and supportive work environment
Access to conferences and professional development
