london, south east england, United Kingdom Hybrid / WFH Options
Intec Select
with CI/CD tools like Jenkins and GitLab CI. Solid understanding of containerisation and orchestration technologies, such as Kubernetes. Experience with monitoring and observability tools (e.g., Grafana, Prometheus) for maintaining infrastructure health and performance. Package: £130,000 - £140,000 basic salary; remote working options; bonus up to 25%; Excellent …
authentication, authorisation, single sign-on, multi-factor authentication, identity lifecycle management, OAuth 2.0, OpenID Connect, SAML and policy management. Knowledge of Site Reliability Engineering, automation, observability, incident management, resilience, disaster recovery, high availability, documentation. IAM engineering experience, authentication, authorisation, single sign-on, multi-factor authentication, user lifecycle management, hands-on CI …
Help manage OS/kernel compatibility for performance-critical apps. Guide developers on infrastructure choices and ensure consistency across environments. Contribute to performance tuning, observability improvements, and tooling evolution. What helps: Solid grounding in infrastructure as code, especially with Terraform and Ansible. Experience with AWS and Kubernetes, or the curiosity …
london, south east england, United Kingdom Hybrid / WFH Options
Premier Group
automate infrastructure provisioning and management. Establish and maintain robust security controls across all cloud environments, ensuring compliance with relevant standards and regulations. Utilise advanced observability tools to monitor and optimise the performance of production services, proactively identifying and resolving issues. Design and optimise CI/CD pipelines using platforms such …
techniques to automate processes and solve challenging business problems. Maintain quality, security, reliability, and compliance of all solutions through digital best practices. Build robust observability into solutions and monitor production health. Advocate for client needs and deliver solutions that exceed expectations. Establish and share best practices and methodologies across the …
london, south east england, United Kingdom Hybrid / WFH Options
RedCat Digital
based application architecture and stack, preferably including AWS. Good understanding of Docker and experience with CI/CD tooling. Good understanding of security and observability best practices and tooling. What else? Experience building and maintaining high-traffic server-side web applications. Experience with infrastructure-as-code tools such as Terraform …
Work closely with analysts, data scientists, and business stakeholders to align data systems with evolving needs. Promote engineering best practices around version control, testing, observability, and documentation. Guide improvements to data quality, reliability, and governance through policy and tooling. Stay current with emerging technologies and make informed recommendations to modernize …
with a cloud provider (AWS/Azure/GCE), or sysadmin/SRE experience in data centers. Experience designing, building, and operating high-scale observability or infrastructure systems. Working knowledge of networking fundamentals; experience with CNIs or cloud networking infrastructure preferred. What We Require: 4+ years of professional software development …
with MLOps tools such as MLflow, DVC, Kubeflow, Docker/Kubernetes, and GitOps practices. Strong working knowledge of Azure and Databricks services. Proficient with observability and monitoring tools (e.g. Prometheus, Grafana, Datadog). Curious and commercially minded, focused on delivering scalable, valuable solutions. Familiarity with additional cloud platforms such as AWS …
configuration management. Experience with cloud infrastructure and managing tools for cloud services (e.g., AWS Lambda, EC2, Kubernetes). Hands-on experience with monitoring and observability tools like Prometheus, Grafana, or ELK Stack for tracking tool performance. Knowledge of security best practices in managing tool configurations, particularly for CI/CD …
london, south east england, United Kingdom Hybrid / WFH Options
eTeam
for fast responses. Design fault-tolerant and resilient distributed systems using Kubernetes and cloud-native technologies. Utilize Prometheus, Grafana, and Kibana for monitoring and observability of backend systems. Optimize API performance and response times for a seamless user experience. Data Analytics & User Insights: Integrate real-time data processing and analytics …
london, south east england, United Kingdom Hybrid / WFH Options
Prism Digital
Actions & OIDC – build and maintain automated CI/CD pipelines with secure authentication. Datadog, Prometheus or similar – implement logging, metrics, and alerting for robust observability – the interim CTO is keen to hear your recommendation(s) on tooling and implementation strategy. Disaster recovery and security tooling – ensure platform resilience and safe …
securely, recover quickly, and move with confidence. You’ll build the systems that make rapid iteration safe, from deployment workflows and rollout strategies to observability, alerting, and incident response. You’ll be actively exploring innovative approaches to automation, deployment, and environment management: things that give our client leverage, speed, and …
of working with Kubernetes and Cloud Platforms (AWS, GCP or Azure). Expertise in one or more of the following areas: Database Administration, Networking, Observability Tools, or automation of infrastructure. Ability to tackle design and functionality problems independently with little to no oversight. Excellent debugging and troubleshooting skills. Preferred qualifications …
spanning from foundational OS networking layers to cloud provider configurations. Proven experience in leading projects within security-focused areas, such as runtime scanning, security observability, CSPM, and more. Cloud Expertise: Strong experience with at least one cloud platform (AWS, Azure, GCP), including expertise in IAM, VPC networking, security groups, and …