…one or more public cloud providers such as Azure, AWS or GCP. Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation. Experience with monitoring, observability and logging tools such as Datadog, Prometheus, Grafana, or similar. Proven track record of maintaining highly available and performant production environments. Ability to identify and implement effective mitigation strategies and …
…using Kubernetes, including Amazon EKS (Elastic Kubernetes Service) and Azure Kubernetes Service (AKS), ensuring their reliability, availability, and performance. Monitoring and Alerting: Monitor application performance and system health through observability tools (e.g., Prometheus, Grafana, ELK stack), proactively identifying and resolving issues to ensure high availability and rapid incident response. Security and IAM: Implement security best practices, managing Identity and Access …
…automation). Clear understanding of Infrastructure as Code. Experience with Helm/Kustomize/Kapitan or similar deployment tools. Experience with Grafana/Prometheus/Splunk or similar monitoring/observability technologies. Ability to independently deliver solutions and work effectively in cross-functional teams. Start date is ASAP for the DevOps Engineer. The Senior DevOps Engineer will be responsible for: Working …
Ensure APIs are optimized for performance, scalability, and security. Implement API authentication, authorization, and security protocols (OAuth, JWT, Mutual TLS, etc.). Monitor API performance and troubleshoot issues using observability tools. Design and implement API versioning, rate limiting, and throttling policies. Provide mentorship to developers and drive API-first design principles. Evaluate new API technologies and trends to continuously improve …
London, South East, England, United Kingdom Hybrid / WFH Options
Charles Simon Associates Ltd
…with CI/CD pipelines and tooling. Solid knowledge of Linux, networking, and cloud security best practices. Experience with scripting languages (Python, Bash, Go). Knowledge of monitoring/observability tools (Prometheus, Grafana, Stackdriver). Nice to Have: Google Professional Cloud DevOps Engineer or Architect certification. Experience with multi-cloud or hybrid cloud environments. Knowledge of service mesh technologies (Istio …
NW10, Middlesex, Greater London, United Kingdom Hybrid / WFH Options
ITH Pharma
…automatically deploy updates and fixes into the production environment. Maintenance, troubleshooting: Perform routine application maintenance to ensure the production environment runs smoothly. Develops maintenance requirements and procedures. Monitoring and Observability: Monitors servers, applications and clusters for failures, system crashes, resource usage, etc., using tools like Prometheus, Grafana or Elastic Stack (Elasticsearch, Logstash and Kibana). FURTHER DUTIES WILL …
London (City of London), South East England, United Kingdom
rmg digital
…Jira, TeamCity. Expert-level knowledge of DevOps tools like Bitbucket/GitHub, SonarQube, CAST, TeamCity/Jenkins/Azure DevOps. Expert-level knowledge of telemetry and observability platforms like ELK stack, Grafana, Kibana, Azure Application Insights, AWS CloudWatch, etc. Scripting languages, preferably Python, PowerShell. Database technologies, preferably MS SQL Server, PostgreSQL. Infrastructure as code – AWS …
…security. Collaborating with cross-functional teams to design and implement high-quality cloud solutions. Administering and supporting Databricks environments, including permissions, storage, and networking. Troubleshooting complex technical issues using observability tools and root-cause analysis. Implementing infrastructure management best practices and automating repetitive tasks. Supporting program installations, system configurations, and user modifications. Refining system monitoring and reporting in collaboration with …
…upgrades, ensuring comprehensive testing across third-party and custom-built applications. Establish Advanced Performance Engineering: Establish a robust performance engineering strategy, integrating advanced tools for application performance monitoring (APM), observability, and telemetry. Focus on early identification of performance bottlenecks and quality assurance measures tailored for large-scale enterprise systems, ensuring seamless functionality across platforms. Collaborate Across Cross-Functional Teams/ …
London, South East, England, United Kingdom Hybrid / WFH Options
Robert Half
…Java, Python, or similar). Hands-on experience building and maintaining CI/CD pipelines (Azure DevOps, GitHub Actions, Jenkins, or similar). Strong understanding of monitoring, logging, and observability tools (e.g., AppInsights, ELK, Prometheus, Grafana). Solid knowledge of test-driven development and experience embedding TDD in automated delivery workflows. Experience working directly within software development teams to support …
…and team collaboration. Working knowledge of databases and SQL. Strong capability with infrastructure on AWS, Kubernetes & Terraform. Good understanding of DNS, CI/CD workflows and infrastructure as code. Observability experience (e.g., Prometheus, Grafana). Development experience with object-oriented programming (e.g., Java). Diligence, quality focus, and analytical skills. Proactive in contributing to organisational success …
…in the team. Contribute to solution architecture and strategic technical direction. Build, integrate, and maintain REST APIs and backend services. Champion best practices in software quality, CI/CD, observability, and DevOps. Collaborate with cross-functional teams including Product, QA, and DevOps. Optionally take on people management responsibilities for engineers. Stay updated with emerging backend and cloud technologies. Key Skills …
London, South East, England, United Kingdom Hybrid / WFH Options
Computer Futures
…across mission-critical systems. Your mission includes: architecting and maintaining AWS cloud environments; managing Kubernetes clusters (plus Docker & Helm); building CI/CD pipelines and automated deployment tools; driving observability with tools like CloudWatch, ELK, and Grafana; mentoring junior engineers and shaping DevOps best practices; ensuring security, compliance, and disaster recovery readiness. What You Bring: You're a tech-savvy …
London (City of London), South East England, United Kingdom
HCLTech
…Apache Airflow for orchestrating complex data workflows and ensuring reliable execution. Understanding of cloud security and governance practices including IAM, KMS, and data access policies. Experience with monitoring and observability tools such as CloudWatch. Experience working in Agile/Scrum environments, participating in sprint planning, retrospectives, and backlog grooming. Good to Have: Exposure to Azure data services such as Azure …
…UI. Exposure to micro frontend architecture (e.g., Module Federation or Single-SPA). Experience with cloud-native DevOps tooling: Docker, Kubernetes, AWS/GCP deployments. Proficiency in analytics and observability tools like Sentry, Datadog, or LogRocket. Soft Skills: Strategic thinker with strong problem-solving and decision-making skills. Ability to work in fast-paced, agile environments with cross-functional teams. …
…FastAPI and SQLAlchemy for building robust and efficient backend solutions. Strong hands-on experience with Terraform for infrastructure as code, enabling scalable and reliable systems. Experience with monitoring and observability tools, such as Datadog or Prometheus. Familiarity with event-driven systems, particularly Kafka and/or RabbitMQ. Deep understanding of messaging and queuing systems, including design patterns for reliability, retries …
…availability, fault tolerance, and cost efficiency. Lead troubleshooting efforts and mentor junior engineers. Skills & Experience: strong AWS expertise and DevOps practices; Infrastructure as Code (Terraform); configuration management and automation; observability tools (CloudWatch, ELK, Grafana); CI/CD pipeline design and management; scripting/programming skills (any language). My client offers the innovation and agility of a smaller team with the …