My client is transforming observability with a modern, full-stack platform that delivers logs, metrics, traces, and security monitoring, cutting costs by up to 70% while boosting efficiency. They are looking for a Lead SRE to own and elevate their Alerting & Incident Management platform. You’ll be the driving force behind reliability, customer satisfaction, and product excellence, ensuring smooth …
RAG. Plus embeddings, vector databases, retrieval strategies, and how to measure if your AI is telling the truth. Experience building systems that handle sensitive business data securely, with proper observability and reliability. Strong communication skills, a sense of ownership, and comfort with the pace of startup life. …
London, South East, England, United Kingdom Hybrid / WFH Options
Circle Recruitment
solid understanding of quality engineering principles. Ability to work autonomously and collaboratively in a fast-paced, cross-functional environment. A holistic view of quality, considering everything from testability and observability to scalability and resilience. Ideal Background Degree in Computer Science, Engineering, or a related field. Proven experience in quality engineering roles with a focus on continuous improvement and cross-team …
environments Real world experience delivering data quality management and data profiling Broad understanding of database designs, schema designs and data mapping Experience with tools supporting data management (governance, quality, observability, analytics) Excellent written and verbal communication, with the ability to develop clear requirements and specifications and communicate complex technical information to both technical and non-technical colleagues Excellent people skills …
at least). Collaborating in Scrum ceremonies and engaging with cross-functional teams for requirements. Managing CI/CD pipelines for automated deployments and reliability. Monitoring system health with observability tools and addressing issues proactively. Engaging with stakeholders for alignment on project goals and updates. Researching new technologies to improve the Snowplow ecosystem. We'd love to hear from you …
and the adoption of new standards and protocols within the payment ecosystem Collaborate on Process & Tooling Automation: Work closely together with internal teams, including Merchant Data Analytics and Merchant Observability, to design, develop, and implement standardized, repeatable workflows, robust monitoring systems, and automated tools Who you are: 5+ years of experience in payments, acquiring, or card networks, demonstrating a unique …
Databases (Mongo) Test automation following Test Driven Development Practices including Unit, Integration and end-to-end testing Supporting a highly-available production system, diagnosing issues raised from logs and observability tooling (Dynatrace), triage and resolution. Company Benefits A Competitive Salary, Pension Scheme and Life Assurance Along with 25 Days Annual Leave plus an Additional Day on us for your Birthday …
release strategy balancing risk and speed of delivery Assisting the team with support process and incident management Pairing with other team members and encouraging a focus on quality and observability in the team Working closely with the Product Owner and Delivery Lead on prioritizing product features and uncovering edge cases Managing the issue backlog and facilitating bug triage based on …
mesh, API gateways, and commercial vs. open source software. Approaches to managing architectural debt, architecture governance and evolution in practice Microservices topologies, including operational concerns such as resiliency, observability, discovery and routing, security etc. Have experience with, and understand how to lead, legacy integration and remediation (facades, strangler approaches, et al.). Deep understanding of different integration patterns and …
for a DevOps Engineer with strong site reliability principles to join our Platform team. You’ll focus on maintaining and improving production reliability, automating operational tasks, and enhancing our observability stack. You’ll work closely with SREs, support engineers, release managers, and incident managers to ensure our systems meet SLIs, SLOs, and SLA targets. Key Responsibilities Maintain and optimise production … Proficient with AWS services relevant to production workloads (EKS, EC2, RDS/Aurora, S3, IAM). Infrastructure as Code with Terraform and configuration management with Ansible. Strong experience with observability tools (Grafana, Prometheus, Loki, Tempo). Understanding of SRE concepts (SLIs, SLOs, error budgets, capacity planning). Comfortable working in incident and problem management processes. Strong GitOps mindset for managing …
London (City of London), South East England, United Kingdom
Harnham
performing team of ML engineers. Combine ML with physics-based risk models (flooding, tropical cyclones, wildfires) to deliver grounded, high-impact solutions. Establish gold-standard practices for evaluation, deployment, observability, and maintainability in ML model development. Turn complex technical challenges into clear business outcomes for colleagues and customers. Requirements: MSc or PhD Degree in Computer Science, Artificial Intelligence, Mathematics, Statistics …
London (City of London), South East England, United Kingdom
Hlx Life Sciences
in biotech, pharma, or AI-driven drug discovery Experience in both large organisations (with structured processes and metrics) and smaller/startup environments (delivering with limited resources) Knowledge of observability and reliability practices for product platforms Security or compliance experience Why Join? Be part of a world-class AI-first research environment shaping the future of drug discovery Work on …
London, England, United Kingdom Hybrid / WFH Options
Hays
SQL for validation and analysis Experience working with offshore engineering partners Proficient with tools such as JIRA, Confluence, Figma and an API documentation platform Nice to have Exposure to observability and support workflows, for example Grafana or similar Experience in UX or service design research and translating insights into requirements Familiarity with cloud platforms and CI/CD concepts Background …
London, South East England, United Kingdom Hybrid / WFH Options
Hays
SQL for validation and analysis Experience working with offshore engineering partners Proficient with tools such as JIRA, Confluence, Figma and an API documentation platform Nice to have Exposure to observability and support workflows, for example Grafana or similar Experience in UX or service design research and translating insights into requirements Familiarity with cloud platforms and CI/CD concepts Background …
London, Lime Street, United Kingdom Hybrid / WFH Options
Hays Technology
SQL for validation and analysis Experience working with offshore engineering partners Proficient with tools such as JIRA, Confluence, Figma and an API documentation platform Nice to have Exposure to observability and support workflows, for example Grafana or similar Experience in UX or service design research and translating insights into requirements Familiarity with cloud platforms and CI/CD concepts Background …
and improving experimental design for marketing applications Apply generative AI, reinforcement learning and other ML techniques to improve operational efficiency and campaign outcomes through automation and scalability Implement monitoring, observability and guardrails for AI systems to ensure performance, reliability, compliance and safety Identify and address algorithm risks or inefficiencies before they impact marketing performance Collaborate with Marketing, Engineering, Science and …
and help shape how platform engineering is done as the team continues to scale. Tech stack AWS (Core services - EC2, RDS, S3, IAM, etc.) Configuration Management Ansible Monitoring and Observability Grafana, Prometheus Kubernetes (building and managing production clusters) Terraform (IaC provisioning) GitHub Actions (CI/CD pipelines) What They’re Looking For Experience in AWS cloud infrastructure (ideally in a … regulated or high-traffic environment) Previous experience working with Monitoring and Observability Tools Hands-on Kubernetes know-how, specifically with EKS. Solid IaC experience with Terraform. Experience with containerisation (Docker, Helm) and CI/CD (GitHub Actions or similar) A good communicator who enjoys working collaboratively across product and engineering The client is willing to take someone that doesn't …