authentication, authorisation, single sign-on, multi-factor authentication, identity lifecycle management, OAuth 2.0, OpenID Connect, SAML and policy management. Knowledge of Site Reliability Engineering, automation, observability, incident management, resilience, disaster recovery, high availability, documentation. IAM engineering experience, authentication, authorisation, single sign-on, multi-factor authentication, user lifecycle management, hands-on CI …
London, South East England, United Kingdom Hybrid / WFH Options
Premier Group
automate infrastructure provisioning and management. Establish and maintain robust security controls across all cloud environments, ensuring compliance with relevant standards and regulations. Utilise advanced observability tools to monitor and optimise the performance of production services, proactively identifying and resolving issues. Design and optimise CI/CD pipelines using platforms such …
Romsey, Hampshire, South East, United Kingdom Hybrid / WFH Options
Robert Half
practices. What You'll Be Doing: Infrastructure as Code - Manage, optimize, and automate cloud environments. Security & Reliability - Implement best practices for performance, security, and observability. Collaboration - Work closely with stakeholders to streamline deployments and CI/CD pipelines. Troubleshooting & Monitoring - Ensure high availability and efficiency of systems. Must-Have Skills …
London, South East England, United Kingdom Hybrid / WFH Options
RedCat Digital
based application architecture and stack, preferably including AWS. Good understanding of Docker and experience with CI/CD tooling. Good understanding of security and observability best practices and tooling. What else? Experience building and maintaining high-traffic server-side web applications. Experience with infrastructure-as-code tools such as Terraform …
streaming technologies like Spark Structured Streaming or Apache Flink. Additional programming skills in PowerShell or Bash. Understanding of Databricks Ecosystem components. Familiarity with Data Observability or Data Quality Frameworks.
with a cloud provider (AWS/Azure/GCE), or sysadmin/SRE experience in data centers. Experience designing, building, and operating high-scale observability or infrastructure systems. Working knowledge of networking fundamentals; experience with CNIs or cloud networking infrastructure preferred. What We Require: 4+ years of professional software development …
the best possible solutions for public and private cloud environments and develop infrastructure technology to comply with security, resilience, sustainability, and operational requirements with observability and guardrails built in. You’ll also champion and drive the use of automation to provide testing and a route to live for the product …
configuration management. Experience with cloud infrastructure and managing tools for cloud services (e.g., AWS Lambda, EC2, Kubernetes). Hands-on experience with monitoring and observability tools like Prometheus, Grafana, or ELK Stack for tracking tool performance. Knowledge of security best practices in managing tool configurations, particularly for CI/CD …
London, South East England, United Kingdom Hybrid / WFH Options
Prism Digital
Actions & OIDC – build and maintain automated CI/CD pipelines with secure authentication. Datadog, Prometheus or similar – implement logging, metrics, and alerting for robust observability; the interim CTO is keen to hear your recommendation(s) on tooling and implementation strategy. Disaster recovery and security tooling – ensure platform resilience and safe …
London (Hounslow), South East England, United Kingdom
eTeam
for fast responses. • Design fault-tolerant and resilient distributed systems using Kubernetes and cloud-native technologies. • Utilize Prometheus, Grafana, and Kibana for monitoring and observability of backend systems. • Optimize API performance and response times for a seamless user experience. Data Analytics & User Insights: • Integrate real-time data processing and analytics …
securely, recover quickly, and move with confidence. You’ll build the systems that make rapid iteration safe, from deployment workflows and rollout strategies to observability, alerting, and incident response. You'll be actively exploring innovative approaches to automation, deployment, and environment management: things that give our client leverage, speed, and …
of working with Kubernetes and Cloud Platforms (AWS, GCP or Azure). Expertise in one or more of the following areas: Database Administration, Networking, Observability Tools, or automation of infrastructure. Ability to tackle design and functionality problems independently with little to no oversight. Excellent debugging and troubleshooting skills. Preferred qualifications …
spanning from foundational OS networking layers to cloud provider configurations. Proven experience in leading projects within security-focused areas, such as runtime scanning, security observability, CSPM, and more. Cloud Expertise: Strong experience with at least one cloud platform (AWS, Azure, GCP), including expertise in IAM, VPC networking, security groups, and …
influencing technical decisions across the different stakeholder levels of the business including non-technical audiences. Ability to foster a culture around data-driven reliability, observability, monitoring, and automation. Due to the global nature of the team, a degree of flexible working will be required to accommodate different time zones. We …
Leeds, Yorkshire, United Kingdom Hybrid / WFH Options
William Hill PLC
functional requirements into working software alongside your team. Collaborate with the team to analyze, debug, and resolve defects. Demonstrate a commitment to monitoring and observability. Manage technical debt effectively by avoiding its creation and removing it when possible. Communicate clearly, translating technical and non-technical requirements as needed. Understand timelines …
impact on operations. Participate in a support on-call schedule. What We Value: Confidence in troubleshooting complex systems issues independently using stack traces and observability & systems tools. Comfort with managing large-scale production systems and technologies with configuration management, load balancing, monitoring & alerting infrastructure, and container orchestration. Ability to work …
delivering software features into production, ideally in a B2B SaaS or data-rich environment. Dedicated to driving best practice within the SDLC, including quality, observability, CI/CD, SOLID and Design Patterns. Strong background in software engineering with hands-on experience in developing, evaluating, and deploying complex systems. Proficiency with …
and collaborate with Product, Data, and Artist Relations to translate business goals into resilient software. 5%: Champion DevEx, proposing improvements to CI/CD, observability, and performance. You'll be successful here if you have 7+ years professional experience (at least 3 in a senior/lead capacity) delivering production …