one or more public cloud providers such as Azure, AWS or GCP. Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation. Experience with monitoring, observability and logging tools such as Datadog, Prometheus, Grafana, or similar. Proven track record of maintaining highly available and performant production environments. Ability to identify and implement effective mitigation strategies and …
automation). Clear understanding of Infrastructure as Code. Experience with Helm/Kustomize/Kapitan or similar deployment tools. Experience with Grafana/Prometheus/Splunk or similar monitoring/observability technologies. Ability to independently deliver solutions and work effectively in cross-functional teams. Start date is ASAP for the DevOps Engineer. The Senior DevOps Engineer will be responsible for: Working …
Ensure APIs are optimized for performance, scalability, and security. Implement API authentication, authorization, and security protocols (OAuth, JWT, Mutual TLS, etc.). Monitor API performance and troubleshoot issues using observability tools. Design and implement API versioning, rate limiting, and throttling policies. Provide mentorship to developers and drive API-first design principles. Evaluate new API technologies and trends to continuously improve …
NW10, Middlesex, Greater London, United Kingdom Hybrid / WFH Options
ITH Pharma
automatically deploy updates and fixes into the production environment. Maintenance and troubleshooting: Perform routine application maintenance to ensure the production environment runs smoothly. Develop maintenance requirements and procedures. Monitoring and Observability: Monitor servers, applications and clusters for failures, system crashes, resource usage, etc., using tools like Prometheus, Grafana or Elastic Stack (Elasticsearch, Logstash and Kibana). FURTHER DUTIES WILL …
and team collaboration. Working knowledge of databases and SQL. Strong infrastructure capability with AWS, Kubernetes & Terraform. Good understanding of DNS, CI/CD workflows and infrastructure as code. Observability experience (e.g., Prometheus, Grafana). Development experience with object-oriented programming (e.g., Java). Diligence, quality focus, and analytical skills. Proactive in contributing to organisational success …
a bias for Infrastructure (Python, Go, C#) • IAM Policy and Authentication/Authorization schemes • Web Services and REST API • Databases and Storage Systems • Development Build, Test, and Deployment Pipelines • Observability and Monitoring (OpenTelemetry, TIG and ELK stacks). Together, as owners, let’s turn meaningful insights into action. Life at CGI is rooted in ownership, teamwork, respect and …
upgrades, ensuring comprehensive testing across third-party and custom-built applications. Establish Advanced Performance Engineering: Establish a robust performance engineering strategy, integrating advanced tools for application performance monitoring (APM), observability, and telemetry. Focus on early identification of performance bottlenecks and quality assurance measures tailored for large-scale enterprise systems, ensuring seamless functionality across platforms. Collaborate Across Cross-Functional Teams/ …
London (City of London), South East England, United Kingdom
rmg digital
Jira, TeamCity. Expert-level knowledge of DevOps tools like Bitbucket/GitHub, SonarQube, CAST, TeamCity/Jenkins/Azure DevOps. Expert-level knowledge of telemetry and observability platforms like ELK stack, Grafana, Kibana, Azure Application Insights, AWS CloudWatch, etc. Scripting languages, preferably Python, PowerShell. Database technologies, preferably MS SQL Server, PostgreSQL. Infrastructure as code – AWS …
skills/knowledge/experience: Hands-on experience with AWS services at the DevOps Engineer level. Prior experience managing incidents, changes, and problem-solving processes. Strong background in enterprise observability tooling, specifically Prometheus, Grafana, and Splunk, with experience in PromQL. Proficiency in one or more programming languages such as Python, Go, Bash, or SQL. Familiarity with GitHub, GitOps, container orchestration …
in the team. Contribute to solution architecture and strategic technical direction. Build, integrate, and maintain REST APIs and backend services. Champion best practices in software quality, CI/CD, observability, and DevOps. Collaborate with cross-functional teams including Product, QA, and DevOps. Optionally take on people management responsibilities for engineers. Stay updated with emerging backend and cloud technologies. Key Skills …
London, South East, England, United Kingdom Hybrid / WFH Options
Computer Futures
across mission-critical systems. Your mission includes: Architecting and maintaining AWS cloud environments; managing Kubernetes clusters (plus Docker & Helm); building CI/CD pipelines and automated deployment tools; driving observability with tools like CloudWatch, ELK, and Grafana; mentoring junior engineers and shaping DevOps best practices; ensuring security, compliance, and disaster recovery readiness. What You Bring: You're a tech-savvy …
automation scripts, infrastructure as code, creating tooling or frameworks and feature development, ideally using Java and/or Python. • Experience of engineering enablement products such as CI/CD, Observability and Alerting • Experience creating designs and documentation, including "how-to" user guides • Experience of investigating and resolving incidents and problems aligned to the SLAs • Continuously seeking opportunities for system performance …
UI. Exposure to micro frontend architecture (e.g., Module Federation or Single-SPA). Experience with cloud-native DevOps tooling: Docker, Kubernetes, AWS/GCP deployments. Proficiency in analytics and observability tools like Sentry, Datadog, or LogRocket. Soft Skills: Strategic thinker with strong problem-solving and decision-making skills. Ability to work in fast-paced, agile environments with cross-functional teams. …
FastAPI and SQLAlchemy for building robust and efficient backend solutions. Strong hands-on experience with Terraform for infrastructure as code, enabling scalable and reliable systems. Experience with monitoring and observability tools, such as Datadog or Prometheus. Familiarity with event-driven systems, particularly Kafka and/or RabbitMQ. Deep understanding of messaging and queuing systems, including design patterns for reliability, retries …
availability, fault tolerance, and cost efficiency. Lead troubleshooting efforts and mentor junior engineers. Skills & Experience: Strong AWS expertise and DevOps practices; Infrastructure as Code (Terraform); configuration management and automation; observability tools (CloudWatch, ELK, Grafana); CI/CD pipeline design and management; scripting/programming skills (any language). My client offers the innovation and agility of a smaller team with the …
London, South East, England, United Kingdom Hybrid / WFH Options
Sanderson
with IAM engineering experience across authentication, authorisation, single sign-on, multi-factor authentication, identity lifecycle management, OAuth 2.0, OpenID Connect, SAML and policy management. Knowledge of Site Reliability Engineering, automation, observability, incident management, resilience, disaster recovery, high availability, documentation. IAM engineering experience, authentication, authorisation, single sign-on, multi-factor authentication, user lifecycle management, hands-on CI/CD approaches and technologies …
Employment Type: Full-Time
Salary: £100,000 - £135,000 per annum, Inc benefits
London (City of London), South East England, United Kingdom
HCLTech
Apache Airflow for orchestrating complex data workflows and ensuring reliable execution. Understanding of cloud security and governance practices including IAM, KMS, and data access policies. Experience with monitoring and observability tools such as CloudWatch. Experience working in Agile/Scrum environments, participating in sprint planning, retrospectives, and backlog grooming. Good to Have: Exposure to Azure data services such as Azure …
scalability and reduce manual intervention. Operational Security, SRE & Assurance: Ensure security platforms are resilient, continuously monitored, and designed for 24x7 support and incident response readiness. Embed security telemetry and observability to enable proactive threat detection and automated response. Apply SRE principles to improve reliability, performance, and maintainability of security services. Lead platform health, patching automation, and vulnerability remediation workflows. Define …
and Knative to ensure an effective and responsive infrastructure. Infrastructure as Code (IaC): Using Terraform and Terragrunt to manage resources across environments. Monitoring Tools: Prometheus, Grafana, and other observability tools to ensure system health and visibility. What You'll Gain: Purpose: Use your skills to contribute to global sustainability goals through innovative technology. Collaboration: Be part of a leadership …