/CD and GitOps practices with GitHub Actions and ArgoCD, including automated testing, vulnerability scanning, and environment promotion workflows. Drive the definition and implementation of observability standards - Prometheus, Grafana, Loki/ELK, Jaeger, Sentry - enabling end-to-end visibility and SLA tracking. Define scalability and reliability patterns (KEDA, HPA, circuit breakers, bulkheads, caching tiers) and ensure resilience of critical …
Cambridge, England, United Kingdom Hybrid / WFH Options
RegGenome
learn. Hands-on experience with Kubernetes and Terraform/Terragrunt/OpenTofu. Strong cloud infrastructure knowledge in either AWS or GCP. Nice to Have: Monitoring stack tools: Prometheus, Thanos, Loki, Alertmanager, Grafana. CI/CD experience with FluxCD (or ArgoCD). Database performance optimization and management experience. Qualities We Value: Solution-oriented mindset with a knack for solving tough …
LL BRING: Proven experience in observability, SRE, or platform engineering roles within complex, distributed environments. Strong hands-on expertise with telemetry tools such as OpenTelemetry, Prometheus, Grafana, Splunk, Elastic, Loki, Jaeger, or similar. Proficiency in at least one programming language (e.g., Python, Go, Java) and infrastructure-as-code tools (e.g., Terraform, Helm). Deep understanding of cloud-native …
GitOps practices. Expertise in cloud platforms (AWS, GCP, Azure) and cloud architecture; certifications are a plus. Experience with Kubernetes, Docker, and microservices, as well as monitoring tools (Prometheus, Grafana, Loki, Mimir). Strong experience in Infrastructure as Code (IaC) and configuration management (especially Terraform). Responsibilities: As a Senior DevOps Engineer (f/m/d), you will be responsible for …
teams, helping to establish telemetry standards, efficient usage patterns, and scalable platform abstractions. Ability to make forward-looking technical decisions and lead others through ambiguity. Familiarity with ClickHouse, Grafana Loki, Athena, or equivalent systems for log and metrics querying. Contributions to open-source observability tools or communities. Experience building cost visibility or FinOps tooling for cloud compute and telemetry …
leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on: Designing and scaling observability …
City of London, London, United Kingdom Hybrid / WFH Options
Motive Group
leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on: Designing and scaling observability …
London, South East England, United Kingdom Hybrid / WFH Options
Motive Group
leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on: Designing and scaling observability …
Slough, South East England, United Kingdom Hybrid / WFH Options
Motive Group
leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on: Designing and scaling observability …
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Motive Group
leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on: Designing and scaling observability …
Belfast, Northern Ireland, United Kingdom Hybrid / WFH Options
ViVA Tech Talent
ensuring the product runs smoothly at scale. What You’ll Be Doing: Designing and managing observability tooling across monitoring, logging, and alerting. Leading the rollout of Grafana, Prometheus, and Loki, and evolving the observability stack. Building and maintaining infrastructure on AWS, using Terraform for everything IaC. Partnering with product and platform teams to define and track SLIs/SLOs. … Troubleshooting and improving production systems, driving reliability best practices. What We’re Looking For: Solid experience operating production-grade observability systems (Grafana/Prometheus/Loki or similar). Strong AWS skills and deep familiarity with Infrastructure as Code (Terraform). A collaborative engineer who enjoys shaping standards and mentoring others. Bonus: exposure to OpenTelemetry, Kamal, or modern deployment …
Lisburn, Antrim, United Kingdom Hybrid / WFH Options
ViVA Tech Talent
ensuring the product runs smoothly at scale. What You’ll Be Doing: Designing and managing observability tooling across monitoring, logging, and alerting. Leading the rollout of Grafana, Prometheus, and Loki, and evolving the observability stack. Building and maintaining infrastructure on AWS, using Terraform for everything IaC. Partnering with product and platform teams to define and track SLIs/SLOs. … Troubleshooting and improving production systems, driving reliability best practices. What We’re Looking For: Solid experience operating production-grade observability systems (Grafana/Prometheus/Loki or similar). Strong AWS skills and deep familiarity with Infrastructure as Code (Terraform). A collaborative engineer who enjoys shaping standards and mentoring others. Bonus: exposure to OpenTelemetry, Kamal, or modern deployment …
Newtownabbey, Antrim, United Kingdom Hybrid / WFH Options
ViVA Tech Talent
ensuring the product runs smoothly at scale. What You’ll Be Doing: Designing and managing observability tooling across monitoring, logging, and alerting. Leading the rollout of Grafana, Prometheus, and Loki, and evolving the observability stack. Building and maintaining infrastructure on AWS, using Terraform for everything IaC. Partnering with product and platform teams to define and track SLIs/SLOs. … Troubleshooting and improving production systems, driving reliability best practices. What We’re Looking For: Solid experience operating production-grade observability systems (Grafana/Prometheus/Loki or similar). Strong AWS skills and deep familiarity with Infrastructure as Code (Terraform). A collaborative engineer who enjoys shaping standards and mentoring others. Bonus: exposure to OpenTelemetry, Kamal, or modern deployment …