Bath, Somerset, United Kingdom Hybrid/Remote Options
Seccl Technology Limited
handling, JWK publishing, and SSO connection setup. Utilising Infrastructure as Code (Terraform) and CI/CD (GitHub Actions) to manage Auth0 configuration and ensure safe, repeatable deployments. Implementing comprehensive observability for authentication paths with structured logs, monitoring dashboards, alerts, and SLOs. Collaborating closely with product, engineering, and support teams on migration timelines, communications, and incident response. This role's for … and identity configurations, including secure secrets management. Solid understanding of core AWS services relevant to modern authentication patterns, such as API Gateway, Lambda authorisers, and CloudWatch. A commitment to observability, with hands-on experience implementing structured logging, dashboards, and SLOs for critical services. Excellent collaboration skills, demonstrated through participation in design reviews, pairing, and writing clear technical documentation (e.g., runbooks More ❯
Derby, England, United Kingdom Hybrid/Remote Options
Experis UK
Head of Cloud – Contract (Outside IR35) Location: Hybrid (East Midlands/London 1-2 days/week onsite) Rate: Up to £700/day Contract Type: Outside IR35 Duration: 3-6 months (initial), with potential extension Start Date: ASAP About More ❯
Job Title: Staff Data Engineer Location: London, Hybrid Salary: c.£140,000 + bonus + share options Why Apply? This is a unique opportunity to take a leading role in shaping the data strategy of a fast growing Insurtech scale More ❯
City of London, London, United Kingdom Hybrid/Remote Options
Motive Group
Senior/Staff Site Reliability Engineer - Observability | London (Hybrid) If you care deeply about building and operating world-class infrastructure for AI at scale, this one’s worth your time. We’re working with a company that builds the backbone powering some of the most demanding AI workloads on the planet. Think large-scale GPU clusters, global telemetry systems, and … distributed training environments used by leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on … Designing and scaling observability for globally distributed GPU infrastructure Building automation that cuts operational toil and improves reliability Partnering with platform and infrastructure teams to deliver true visibility across complex AI systems If you’ve built or operated telemetry stacks for large-scale, GPU-heavy, or multi-tenant environments - and want to work on cutting-edge problems in a business More ❯
AND RESPONSIBILITIES Support and enhance the company's infrastructure and production systems across GCP. Contribute to a major replatforming project to GKE, ensuring scalability, automation, and security. Improve reliability, observability, and CI/CD pipelines (GitLab CI, Argo, Flux). Work closely with developers to embed best practices and elevate the internal developer experience (IDP). Collaborate within a small … with GCP (Google Cloud Platform). Deep understanding of CI/CD pipelines - GitLab CI, Argo, or Flux. Experience with HashiCorp Vault and open-source tooling. Background in automation, observability, and platform reliability. Excellent problem-solving skills and a collaborative, pragmatic mindset. THE DETAILS Day rate: £550-£650/day (Outside IR35) Contract: 3 months, with scope for extension More ❯
East London, London, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
City of London, London, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
Bury, Greater Manchester, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
Leigh, Greater Manchester, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
Bolton, Greater Manchester, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
Altrincham, Greater Manchester, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
Leeds, West Yorkshire, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
Central London / West End, London, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
Ashton-Under-Lyne, Greater Manchester, United Kingdom Hybrid/Remote Options
Opus Recruitment Solutions
AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability Are you looking for a genuinely remote opportunity? Somewhere you're part of something bigger, working on a global product within a close-knit SRE team? I've partnered with a web app that provides end-to-end event management for some of the … planet's biggest artists, and they're now looking for an SRE: someone who knows their way around classic observability with Grafana and the ELK stack, and cost optimisation for the product as they continue scaling. Working across the globe in their multi-tenanted AWS environments requires someone who is able to reverse-engineer product faults or carry out post-incident audits to ensure future … like to hear more, send over a CV to robin.shaw@opusrs.com or apply! AWS | GCP | SRE | Site Reliability Engineer | Terraform | CloudFormation | ECS | ELK | Elasticsearch | Logstash | Kibana | CloudWatch | Grafana | Windows | Observability More ❯
City of London, London, United Kingdom Hybrid/Remote Options
Identify Solutions
the past year and aggressive expansion across the UK, US, and EU, the company is scaling at pace. Data is the backbone: from APIs and pipelines to governance and observability, their data platform directly powers customer-facing products and AI-driven insights. They’re now hiring a Senior Data Engineer to own and shape this platform, building scalable, production-grade … systems that become the foundation for global brands. Why join? ✨ Greenfield impact – inherit a live but early platform, define best practice across structure, testing, observability, and governance. ✨ Direct product impact – your APIs, pipelines, and orchestration power the platform that 1,000+ brands rely on every day. ✨ AI at the core – work on infrastructure that enables machine learning and intelligent decision … doing: API strategy & development – own and scale FastAPI endpoints that deliver real-time access to platform data. Data pipeline development – build ingestion and replication pipelines with best-in-class observability, latency, and resilience. Platform technical vision – influence architecture and orchestration, shaping how the business handles data at scale. Data quality & governance – embed testing, freshness, lineage, and monitoring to ensure reliability More ❯
City of London, London, United Kingdom Hybrid/Remote Options
Stanford Black Limited
low-latency network environment. You’ll be joining a collaborative and forward-thinking environment with flat structures and deep technical expertise – ideal for someone who enjoys network automation, observability tooling, and IaC. Below I have included a breakdown of the role, company, and requirements. Please review and, if the opportunity seems like a good fit, share your CV … opportunities for network automation and implement appropriately IaC-heavy environment - work with the likes of Ansible, Python, CI/CD, GitOps practices Deliver troubleshooting, operational enhancements, and BAU changes Develop observability tooling (dashboards, alerts) and build self-healing or event-driven automation Lead post-incident reviews and trend analysis to continuously improve network reliability and performance Company: Technology-led culture – Drives … investment firm Vendors – Arista, Cisco, Corvil, Nvidia (not all necessary, but the more the better) Python/Golang for Network Automation Proven experience with low-latency networking Monitoring and observability tooling such as Nagios, SolarWinds, Prometheus, Alertmanager, Grafana Analysis tooling, e.g. Wireshark, Splunk, PromQL Familiarity with IaC/DevOps tools such as Ansible, GitOps, CI/CD Exposure to vendor More ❯
frameworks that support thousands of real-time processes across global markets. This isn’t a maintenance role - it’s an opportunity to modernise the firm’s CI/CD, observability, and runtime environments from the ground up. What you’ll be doing: Engineering and optimising CI/CD pipelines and container orchestration at scale Modernising Linux-based deployment and runtime … low-latency environment What we’re looking for: 5+ years’ experience in DevOps, Systems, or Platform Engineering Deep knowledge of Linux, Python, and shell scripting Proven experience with Kubernetes, observability tooling, and CI/CD frameworks Strong grasp of distributed systems and performance tuning Trading, Crypto or Hedge Fund Experience Experience working in low-latency, high-frequency trading environments Why More ❯
integrate AI models for predictive insights and intelligent recommendations. Connect AI agents with core enterprise systems and data platforms. Define technical standards, reusable patterns, and governance principles. Ensure reliability, observability, and performance across AI solutions. Mentor engineering teams and foster best practices in design, DevOps, and model lifecycle management. Collaborate with data scientists, product managers, and architects to align solutions … patterns and scalable system design. Excellent communication and leadership skills, with the ability to influence technical direction. Desirable: Experience in computer vision, IoT, or multi-agent systems. Knowledge of observability and monitoring frameworks for AI operations. Apply now for immediate consideration. No sponsorship available; applicants must have the right to work in the UK. More ❯
to solve complex challenges. Drive innovation around cloud-native technologies and platform automation. Balance strategic vision with ~30% hands-on coding and design work. Promote best practice in reliability, observability, and scalability. The Ideal Staff Software Engineer Proven experience operating at Staff+ level within a fast-paced engineering organisation. Strong background in cloud platforms (AWS or GCP) and deep knowledge … ability to build operators. Strong coding skills in Golang, Java, or C#, with experience in distributed systems. Demonstrated leadership across multiple squads and technical roadmaps. Expertise in operational excellence: observability, reliability, automation. This is an outstanding opportunity for a Staff Software Engineer to join a rapidly scaling company where you’ll play a pivotal role in shaping the technical foundations of More ❯
agent context & decision making. Develop backend infrastructure and intelligent automation for knowledge crawling, extraction and enrichment. Help shape and contribute towards our DevOps practices (CI/CD, cloud infrastructure, observability). Stay on the frontier of AI, keeping up to date with emerging tools and technologies to keep us at the edge of what’s possible. Requirements Deep backend expertise … with Python or JavaScript. Strong knowledge of DevOps practices, including CI/CD, cloud infrastructure & observability/monitoring. Previous full ownership of a successful high-volume system. Naturally articulate and able to communicate complex concepts clearly. Evidence of excellence at everything you do. Organic curiosity and obsession for AI and cutting-edge technology. Exceptional problem-solving skills and meticulous attention More ❯