Senior Platform Engineer

Collect Solutions

Location

London / Hybrid

Working style

Remote-first with periodic in-person working sessions

Reports to

CTO (fractional; Gregory Wheeler, Martingale GmbH)

Works with

Principal ML Engineer (Zifan); CTO; Founders

Start date

March 2026

Probation

3 months

Role type

Full-time, permanent

Salary & Benefits

£70-95k, plus bonus, equity, and other benefits

About the role

Collect Solutions is a lean, funded, AI-powered company building software solutions to help businesses collect payments faster and more intelligently. Our products combine agentic AI workflows with payment system integrations to automate collection processes that are currently manual, slow, and error-prone.

We need a senior engineer who will become the day-to-day technical anchor of the platform – someone with strong Azure expertise who owns the production system end-to-end, can build features as well as infrastructure, and brings the operational discipline to run a financial services application reliably.

This is not a pure DevOps role. You will spend significant time on platform engineering, CI/CD, and infrastructure – but you will also contribute to application development, API design, and integration work. We need a generalist who leans platform, not an infrastructure specialist who only touches pipelines.

What you’ll do

Platform and Infrastructure

• Own the Azure production environment end-to-end: resource provisioning, networking, identity and access management, secrets, backups, and cost management.

• Build and maintain CI/CD pipelines (GitHub Actions or Azure DevOps): predictable releases, safe rollbacks, environment promotion (dev/staging/prod), and clean configuration management.

• Establish and maintain Infrastructure as Code (Terraform or Bicep) so that environments are reproducible, auditable, and not manually configured.

• Set up observability: structured logging, metrics, dashboards (Azure Monitor / Grafana), alerting, and runbooks that enable fast diagnosis and recovery.

• Define and track availability and reliability targets; drive post-incident review and preventative improvements.

Application Development

• Contribute to backend development: API endpoints, service logic, database schema, and integration code. This is a building role, not only an operations role.

• Design and implement integration patterns for third-party payment providers, accounting systems (Xero is the first priority), and webhooks – with appropriate retry logic, idempotency, and reconciliation.

• Build and operate async processing infrastructure: queues, workers, schedulers, with attention to reliability patterns (dead-letter queues, safe reprocessing, backpressure).

AI/ML Operations Support

• Partner with the ML engineer on production deployment of AI/ML models and agentic workflows: containerisation, versioning, controlled rollouts, and regression testing.

• Implement operational controls for LLM usage in production: rate limiting, cost monitoring, caching, audit trails, and safe fallback behaviour.

• Support evaluation and testing infrastructure for ML components alongside the ML engineer.

Security and Compliance Foundations

• Implement security fundamentals appropriate to a payments platform: encryption at rest and in transit, least-privilege access, dependency scanning, and secure secret management.

• Establish audit trail and logging patterns that support future compliance requirements, including ISO accreditations (the product handles financial data and payment instructions).

• Participate in on-call as the platform scales, with the expectation that the early team shares responsibility for production availability.

Team and Process

• Work in short iterative cycles (sprints) with the CTO and ML engineer; participate in planning, estimation, and retrospectives.

• Communicate directly with founders on technical status, trade-offs, and priorities – translating engineering decisions into business terms when needed.

• Write documentation that the team can use: architecture decisions, runbooks, onboarding guides. You are building the engineering culture, not just the code.

• Evaluate build-vs-buy decisions for tooling and infrastructure components alongside the CTO.

What we’re looking for

Required

• 5+ years of experience in a platform engineering, DevOps, or backend-heavy full-stack role, with meaningful time spent running production SaaS systems.

• Strong Azure experience: resource management, networking, identity (Entra ID), and at least working familiarity with AKS or App Service, Azure Functions, Service Bus, and Azure Monitor.

• Solid Infrastructure as Code practice (Terraform preferred; Bicep acceptable).

• Experience building and maintaining CI/CD pipelines in a team environment (GitHub Actions or Azure DevOps).

• Competence in at least one backend language (Python strongly preferred, given the ML stack; Go, C#, or TypeScript also valuable).

• Experience with containers (Docker, Kubernetes or equivalent) and modern deployment patterns (blue-green, canary, feature flags).

• Production experience with monitoring, logging, and alerting systems, and a track record of responding to incidents methodically.

• Clear written and verbal communication. You will work closely with non-technical founders and need to explain trade-offs without jargon.

• Ability to support and nurture early customers from pilot to commercialisation, including through the specification and accreditation of accounting application and AI/ML development.

Strongly preferred

• Experience in fintech, payments, or financial services – you understand why idempotency, audit trails, and data integrity matter more here than in most domains.

• Familiarity with LLM/AI production operations: model versioning, evaluation pipelines, cost management, safe rollout patterns.

• Experience with third-party API integrations at scale, including handling real-world failure modes (rate limits, partial failures, webhook reliability).

• Exposure to multi-tenant SaaS architecture and the operational considerations that come with it.

• Experience working in small teams or early-stage companies where you wore multiple hats and owned outcomes, not just tasks.

What this role is not

To set expectations clearly: this is a senior individual contributor role in a small, early-stage team. It is not a management role (there is no team to manage yet), and it is not a pure infrastructure role where you only touch Terraform and pipelines. You will write application code. You will occasionally debug production issues at 2am. You will have opinions about product decisions because you understand the technical constraints. That breadth is the point.

Why this role matters

You will be the engineering anchor of a growing platform. You are the person who makes the production system real, keeps it running, and builds the engineering foundation that everything else sits on. As the company grows, you will have shaped the platform, the practices, and the culture from the ground up.

Equal opportunities

Collect Solutions is an equal opportunity employer. We value diverse perspectives and are committed to creating an inclusive environment for all.

Job Details

Company
collects.io
Location
City of London, London, United Kingdom
Hybrid / Remote Options