related field. 5+ years of experience as a Site Reliability Engineer or in an equivalent role. Proficient in application and infrastructure observability; Splunk OpenTelemetry preferred. Experienced in production environments running in AWS. Comfortable with Infrastructure as Code; Terraform is preferred. Comfortable with CI/CD pipelines such as GitHub …
Prometheus, Logz.io, SignalFX, Instana, Splunk, Honeycomb, Jaeger. Hands-on experience with Infrastructure as Code (Terraform/Ansible). Hands-on experience with technical integrations (OpenTelemetry/fluentd/fluentbit/filebeat/logstash). Hands-on experience with complex troubleshooting of Kubernetes and Docker containers. Good knowledge of Regex, Lucene, PromQL …
Warwick, Warwickshire, United Kingdom Hybrid / WFH Options
ICEO
to implement redundancy and disaster recovery scenarios. Track record in scaling high-efficiency production systems. Proficiency with observability tools (e.g., Prometheus, Grafana, Grafana Mimir, OpenTelemetry). Strong written and spoken English (B2 level or higher). Nice to Have: Experience with Argo CD and Argo Rollouts. Familiarity with technologies such …
Code (IaC): Proficiency with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation. Distributed Tracing: Experience with distributed tracing tools like Jaeger or OpenTelemetry for debugging microservices. Security: Strong knowledge of securing microservices, Kubernetes clusters, and cloud-based applications. Additional Information: We believe that coming together as a community …
Cambridge, England, United Kingdom Hybrid / WFH Options
Tokenovate
API gateways, service mesh architectures, and cloud-native security patterns. Exposure to compliance and auditing requirements in regulated industries. Experience with modern observability stacks (OpenTelemetry, Prometheus, Grafana, Datadog, etc.). Familiarity with authentication and authorisation protocols (OAuth2, OIDC, SAML). WHY JOIN US? Cutting-Edge Work – Build innovative solutions at …
on experience with containerization (Docker, Kubernetes). Strong security mindset with experience in compliance frameworks (SOC, PCI, GDPR). Familiarity with monitoring tools like OpenTelemetry, Instana, or LogicMonitor. Scripting experience (Ruby, Python, Bash) for automation and infrastructure management.
have focus on Observability. Excellent knowledge and hands-on experience with monitoring, logging, and tracing tools such as Prometheus, VictoriaMetrics, Grafana, Datadog, New Relic, OpenTelemetry, ELK Stack, or similar. Experience with high volume data storage (Structured and unstructured). A strong technical background, with current capabilities and willingness to get …
databases (ideally Postgres, MongoDB). Experience of event streaming (Apache Kafka) would also be beneficial. Familiarity with observability platforms such as Grafana, Zabbix, Prometheus, OpenTelemetry/SigNoz. Experience of mobile telecoms principles and platforms would be advantageous but is not mandatory (such as EPC, DIAMETER/SS7 signalling, GTP and …
transform the TechOps team. Participate in the operational management of OpenShift. Work with technologies such as Ansible, PowerShell, C#, SQL Server, Elastic, Grafana, Prometheus, OpenTelemetry, bare-metal builds, and Hyper-V automation. What we are looking for: Experience in TechOps, especially with Infrastructure as Code. Familiarity with development technologies like C# …
roles (e.g. Solutions Architect, Sales Engineering, Pre-Sales). Background in enterprise SaaS, especially in infrastructure monitoring, analytics, or APM. Hands-on expertise with OpenTelemetry, Kubernetes, and modern cloud-native observability stacks. Familiarity with streaming data and real-time metric processing. Experience working in Agile environments and across the full …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Tbwa Chiat/Day Inc
or DBaaS environment. Strong understanding of cloud infrastructure components (e.g., compute, storage, networking) and their cost drivers. Experience with observability tools (e.g., Prometheus, Grafana, OpenTelemetry) and a deep understanding of monitoring and alerting best practices. Exceptional communication skills, capable of articulating complex technical concepts to diverse audiences. Demonstrated ability to …
At Anaplan, we are a team of innovators who are focused on optimizing business decision-making through our leading scenario planning and analysis platform so our customers can outpace their competition and the market. What unites Anaplanners across teams and …
Dundee, Angus, United Kingdom Hybrid / WFH Options
Ivanti
Tools: Help deploy and manage observability platforms such as Azure Application Insights (AppInsights), New Relic, Prometheus, and Grafana. Support Distributed Tracing & Telemetry: Work with OpenTelemetry to collect and export telemetry data for better system insights and debugging. Optimize Logging & Metrics Collection: Assist in implementing structured logging and improving system performance … experience in observability, monitoring, or DevOps-related roles. Basic experience with monitoring tools such as Azure AppInsights, New Relic, Prometheus, and Grafana. Understanding of OpenTelemetry, New Relic, AppInsights APM for telemetry data collection. Familiarity with AWS and Azure cloud environments. Exposure to Kubernetes and container monitoring. Basic scripting knowledge (Python …
Software Engineer – Energy and Trading Sector. Location: London (2 days per week). Rate: Inside IR35 as provided by the contractor. About the Role: We are seeking an experienced Software Engineer to join our global client's dynamic team in the …
JD Sports - Head Office, Warwick House, Bury, Bury, United Kingdom. Req 09 March 2025. Established in 1981 with a single store in the Northwest of England, the JD Group is a leading omni-channel retailer of Sports Fashion, Outdoors and …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
bet365 Group
A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor …
This is a Vice President position within Platform Reliability Engineering and Management, leveraging SRE Principles and Practices, based out of London. This role is looking for a multi-skilled professional with strong technical leadership and people management skills to deliver critical …