A level of network understanding. Awareness of different monitoring tools and protocols used for monitoring (e.g. OpenTelemetry, SNMP, NETCONF). An understanding of what Observability is, and how a company can utilise it. Previous experience of using IT Service Management (ITSM) tools like Remedy or ServiceNow. Understands how various business
comprehensive approach to data control, compliance, and security; unconstrained by their infrastructure providers. Our platform mitigates data security risks while enhancing communication, automation, and observability across data flows, enabling teams to collaborate effortlessly across the organization. We have hubs in London and New York, and we are looking for people
technology portfolio and beyond, helping customers deploy at scale while also delivering AI-powered assurance insights within Cisco's leading Networking, Security, Collaboration, and Observability portfolios. About The Role The Enterprise Account Executive will lead the sales process for ThousandEyes Enterprise Accounts for prospective customers and channel partners in the UK.
allowing Disney Entertainment to be more data-driven. You will work closely with our partner teams to monitor and drive improvements for reliability and observability of their critical data pipelines and deliverables. This is a high-impact role where your work informs decisions affecting millions of consumers, with a direct … using, supporting, and building distributed systems in a fast-paced collaborative team environment. Responsibilities: Assist in designing and developing a platform to support incident observability and automation. This team will be required to build high quality data models and products that monitor and report on data pipeline health and data … of the team. Collaborate with engineering teams to improve, maintain, performance tune, and respond to incidents on our big data pipeline infrastructure. Build out observability and intelligent monitoring of data pipelines and infrastructure to achieve early and automated anomaly detection and alerting. Present your research and insights to all levels
allowing Disney Entertainment to be more data-driven. You will work closely with our partner teams to monitor and drive improvements for reliability and observability of their critical data pipelines and deliverables. This is a high-impact role where your work informs decisions affecting millions of consumers, with a direct … fast-paced collaborative team environment. We also support a healthy work-life balance. Responsibilities Assist in designing and developing a platform to support incident observability and automation. This team will be required to build high quality data models and products that monitor and report on data pipeline health and data … with engineering teams to improve, maintain, performance tune, and respond to incidents on our big data pipeline infrastructure. Own building out key components for observability and intelligent monitoring of data pipelines and infrastructure to achieve early and automated anomaly detection and alerting. Present your research and insights to all levels
risks. Proactively reducing Mean Time to Resolution (MTTR), constantly striving for efficiency gains. Championing an anti-fragility mindset across our architecture, deployment processes, and observability practices. Elevating the customer experience as the ultimate benchmark of our reliability standards. Sharing industry best practices in SRE, ensuring our team remains at the … cloud networking, microservices architecture, and Amazon EKS. Preferred qualifications include: Prior involvement in the Fintech sector or other regulated industries. Familiarity with the Grafana observability stack. Experience in Chaos Engineering methodologies. About Convera Convera is the largest non-bank B2B cross-border payments company in the world. Formerly Western Union
risks. Proactively reducing Mean Time to Resolution (MTTR), constantly striving for efficiency gains. Championing an anti-fragility mindset across our architecture, deployment processes, and observability practices. Elevating the customer experience as the ultimate benchmark of our reliability standards. Sharing industry best practices in SRE, ensuring our team remains at the … cloud networking, microservices architecture, and Amazon EKS. Preferred qualifications include: Prior involvement in the Fintech sector or other regulated industries. Familiarity with the Grafana observability stack. Experience in Chaos Engineering methodologies. Your expertise will be instrumental in fortifying our infrastructure and delivering exceptional reliability to our customers. About Convera Convera
while maintaining robust quality controls. Maintaining and enhancing our Terraform configurations, ensuring they remain modular, reusable, and well-documented. Building a comprehensive monitoring and observability system that provides real-time visibility into platform health, performance metrics, and potential issues. Supporting customers who self-host the Rainbird platform within their own … should have practical experience implementing comprehensive monitoring and logging solutions using tools like AWS CloudWatch. This includes designing dashboards, setting up alerts, and creating observability systems that provide actionable insights. A thorough understanding of cloud security principles is critical, including identity and access management, network security, encryption practices, and compliance
DevOps Engineer (Kubernetes Linux AWS) London/WFH to £70k Are you a technologist DevOps Engineer with a deep knowledge of Kubernetes seeking an opportunity to take ownership and make an impact whilst working on a modern stack with continual
that matters; it's how we do it. DRW is a place of high expectations, integrity, innovation and a willingness to challenge consensus. Our Observability team provides mission-critical support for many of our centralized logging, metrics and tracing tools used throughout the firm. They manage the deployment and administration … practices Interact with vendor support to debug and drive third-party issues to resolution Interface with other teams to be an ambassador of good observability practices Help teams identify data to ingest and how to make use of this data through dashboards and alerting Required Experience: 5+ years of industry … Ability to work well on a team as well as independently What will make you stand out: Experience using Splunk, Grafana, Prometheus and other observability tools Experience using Kubernetes to deploy and maintain systems Experience using Jsonnet or other templating tools to render complex YAML/JSON Familiarity with GitOps
CD pipeline management using Azure DevOps, including YAML pipelines. Experience in writing Infrastructure as Code (IaC) with Terraform or similar tools. Monitoring, Logging, and Observability: Proficient with monitoring, logging, and observability tools such as Azure Monitor, Application Insights, and Log Analytics. Cloud Infrastructure & .NET Applications: Strong understanding of how cloud
APIs, implementing Software Reliability Engineering (SRE) best practices, and closely collaborating with existing teams to develop new software solutions. The team will enhance resilience, observability, incident management, and disaster recovery (DR) practices while supporting the Peri Pantry and Stock Management teams, as well as the Accounting, Banking, and Property (ABP … to deployment, operation, and refinement Partner with multiple product teams to provide support and technical leadership Develop backend services and interfaces, focusing on scalability, observability, and performance Maintain and optimize services post-launch, measuring availability, latency, and system health Implement resilience and automation strategies to enhance system reliability. Collaborate with … party vendors, ensuring efficient API integrations and exploring the best technical solutions. Assist with incident management and postmortem reviews, embedding best practices across teams. Observability & Monitoring: Improve alerting and logging with structured logs, distributed tracing, and metric-driven monitoring. Ensure system health checks and real-time insights into failures. Scalability
knowledge of AWS services, including ECS, Kinesis, DynamoDB, and Lambda Proficiency in CI/CD tools, particularly Jenkins and Spinnaker Familiarity with monitoring and observability tools such as CloudWatch and Datadog Strong understanding of security best practices in cloud environments Preferred Qualifications In addition to the required qualifications, the following … experiences are highly desirable: Experience with event-driven architecture and design patterns Knowledge of the Kubernetes ecosystem, specifically AWS EKS Proficiency with OpenTelemetry for observability Previous experience mentoring and guiding junior team members The Walt Disney Company is an Equal Opportunity Employer. We strive to be a diverse workforce that
user interactions delivering robust internal developer platform (IDP) capabilities, strengthening CI/CD pipelines, enabling on-demand environments, and scaling platform foundations such as observability, security, and FinOps - while adhering to best practices in DevOps and modern software delivery. What we expect from you Drive the development of a comprehensive … QA, and staging through Infrastructure-as-Code and container orchestration. Support multi-tenancy and environment rationalization to reduce duplication and inefficiency. Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive … and operating developer platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
Just Eat Takeaway.com
of restaurant, grocery and convenience partners across the globe. About the role: Just Eat Takeaway is seeking an aspiring Engineer to join the Platform Observability team. The team sits within the Platform & Reliability department, which exists to provide global engineering a magnifying glass into their services while driving commercial availability … and optimization. The team is responsible for looking after a wide range of Observability capabilities that underpin our global platforms. As a Platform Engineer, you will support the implementation and continual evolution of these areas, following guidance from senior engineers within the department. In this role, you will be expected
Job responsibilities: * Guides and assists others in the areas of building appropriate level designs and gaining consensus from peers where appropriate * Experience with Technical Observability tools such as Dynatrace or Datadog would be very useful. Also, expertise with distributed traces would be beneficial. * Collaborates with other software engineers and teams to design … Boot, and .NET * Proficient knowledge of software applications and technical processes within a given technical discipline (e.g., Cloud, artificial intelligence, Android, etc.) * Experience in observability such as white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others
Cloud Automation & Tooling (SAT) Team drives automation, security, and compliance for Sovereign Cloud across AWS, Azure, and OpenStack, leveraging IaC, CI/CD, and observability, and develops the Operations Control Plane (OCP), which orchestrates provisioning, monitoring, and lifecycle management, integrating with our SAP internal tools like SPC, CRM, and cloud automation … environment. Preferred Qualifications Knowledge of Sovereign Cloud compliance and regulatory documentation requirements. Experience documenting tools related to Infrastructure as Code (IaC), CI/CD, observability, or cloud automation. Background in computer science, engineering, or a related technical field. Join us to make a difference by delivering world-class technical documentation
learning and growth, while participating in hiring processes and training engineers up to Staff standard. Operational Stability: Demonstrate a production-first attitude, continuously considering observability and maintaining Service Level Objectives, while delivering change at pace. Research & Innovation: Embrace emerging technologies and trends, and share insights with the organisation, while developing … with Kafka CI/CD with GitHub Actions and Azure pipelines Code quality with Sonar Microservice architecture Azure DevOps, Kubernetes, Docker Azure storage, Redis Observability Tools Dynatrace, New Relic Git, GitHub TDD, BDD Kotlin, .NET Android development Reporting built with MS SSRS and PowerBI Security and performance testing and optimisation
Cloud Automation & Tooling (SAT) Team drives automation, security, and compliance for Sovereign Cloud across AWS, Azure, and OpenStack, leveraging IaC, CI/CD, and observability, and develops the Operations Control Plane (OCP), which orchestrates provisioning, monitoring, and lifecycle management, integrating with our SAP internal tools like SPC, CRM, and cloud automation … environment. Preferred Qualifications Knowledge of Sovereign Cloud compliance and regulatory documentation requirements. Experience documenting tools related to Infrastructure as Code (IaC), CI/CD, observability, or cloud automation. Background in computer science, engineering, or a related technical field. A central role in enabling usability and adoption of cutting-edge cloud
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
Duel
executed effectively. Run effective planning rituals including regular sprint planning, standups, and retrospectives. Build and evolve the platform foundations (from shared services to security, observability, and tooling) that enable product teams to move fast and stay safe. Drive platform improvements to support scale, availability, and performance as our business grows. … engineering experience, ideally with TypeScript in production environments. You're comfortable working with datastores like MongoDB (Atlas), Elasticsearch, and Snowflake. You understand distributed systems, observability best practices, and modern CI/CD workflows. You've built internal platforms or reusable services that support multiple teams or squads. You have a
Leeds, West Yorkshire, Yorkshire and the Humber, United Kingdom
Accenture
Please Note: Any offer of employment is subject to satisfactory BPSS and SC security clearance which requires 5 years continuous UK address history (typically including no periods of 30 consecutive days or more spent outside of the UK). Accenture
Leeds, Yorkshire, United Kingdom Hybrid / WFH Options
William Hill PLC
Our team is building the next generation Sports Betting platform that optimizes flexibility, performance, responsiveness and resiliency. The technologies we like to use include Java, Spring Boot, Kafka, Cassandra, Postgres, Kubernetes, AWS, etc. We are looking for an experienced Java
Manchester Area, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability … maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles … including the creation and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques and best practice including Splunk, New Relic, Grafana and PagerDuty. Excellent knowledge of programming languages including Python, Golang and JavaScript. Knowledge and