such as Kubernetes, Docker Swarm, or HashiCorp Nomad. Excellent problem-solving, communication, and collaboration skills. Nice to have: Experience managing distributed systems, microservices, and event-driven architectures. Knowledge of observability tools such as Prometheus, Grafana, ELK Stack, or Datadog. Experience with security best practices, monitoring, and incident response. Familiarity with DevSecOps and compliance frameworks (ISO 27001, SOC 2, GDPR).
Liverpool, Lancashire, United Kingdom Hybrid / WFH Options
The Granite Group
with GitOps tools (e.g., ArgoCD, Flux). CI/CD - Skilled in building and managing pipelines using Azure DevOps, GitHub Actions, etc. Monitoring - Experience with Prometheus, Grafana, and other observability tools. Application Stack - Familiarity with .NET, Node.js, React, and web server technologies like Nginx. Relevant certifications or the ability to demonstrate equivalent experience, such as: Terraform Associate About Acorn Insurance …
Python (or other language), Bash/Shell, YAML including any Development frameworks Extensive experience and in-depth knowledge of the Linux operating system for effective troubleshooting activities Experience with Observability tools like Grafana, Prometheus, ELK, OCI Observability We highly value ownership and initiative with capabilities to drive projects independently Dealing with changes on a daily basis in a very dynamic …
one or more public cloud providers such as Azure, AWS or GCP Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation. Experience with monitoring, observability and logging tools such as Datadog, Prometheus, Grafana, or similar. Proven track record of maintaining highly available and performant production environments. Ability to identify and implement effective mitigation strategies and …
WAF, CloudFront, API GW, AWS Organizations, S3, ECS, EKS, Route 53, ELBs, OpenShift, Kubernetes, Docker Languages: TypeScript, Python Security & Scanning: AWS Guardrails, Checkov, Prisma Cloud, OSV Scanner, SonarQube, Renovate Observability & Logging: CloudWatch, OpenSearch Operating System Management: RedHat Satellite, AMI lifecycle management, Ubuntu Landscape Testing Tools: Pytest, Jest, Cypress APIs/Microservices: RESTful APIs, API Gateway, containerised services Version Control: GitLab … to as Senior DevOps Engineer. On a typical day you will Design, build, and maintain scalable, high-quality software and platform systems Implement and manage CI/CD pipelines, observability, security automation, automated testing, and engineering standards Lead feature development from concept to production with focus on quality and performance Troubleshoot issues, ensuring resilience, reliability, and minimal user disruption Contribute …
to L3 networking Programming languages, such as C#, Python, Perl, Java, C++ CI/CD tools such as Azure DevOps, GitHub Actions, GitLab, Jenkins, TeamCity Scripting languages such as PowerShell, Bash Observability/Monitoring: Prometheus, Grafana, Splunk Containerisation tools such as Docker, K8S, OpenShift, EC, containers Hosting technologies such as IIS, nginx, Apache, App Service, LightSail Analytical and creative approach to problem …
operating system for effective troubleshooting activities Awareness of any cloud infrastructure principles (like AWS, GCP or OCI), understanding basic principles of secure software delivery is a plus Familiar with Observability tools like Grafana or Prometheus, understanding the importance of giving the correct visibility to our platforms and environments We highly value ownership and initiative with capabilities to drive projects independently
Bristol, Avon, England, United Kingdom Hybrid / WFH Options
Robert Walters
of cloud infrastructure and applications on Google Cloud Platform. You will work collaboratively with engineering and infrastructure teams to implement site reliability engineering (SRE) principles, focusing on system reliability, observability, automation, and operational excellence. This role follows a hybrid working model, requiring attendance at the Bristol office for at least two days per week or 40% of the working time. … objectives (SLOs), indicators (SLIs), and monitoring practices Hands-on experience with infrastructure as code (e.g., Terraform) and CI/CD tools (e.g., Jenkins, Azure DevOps) Desirable Knowledge Familiarity with observability and performance tools such as Dynatrace, Stackdriver, Cloud Monitoring, or similar Exposure to cost monitoring, logging frameworks, and cloud consumption analytics Personal Attributes Ability to mentor and support engineers in …
Washington, Washington DC, United States Hybrid / WFH Options
ClearanceJobs
and cloud security best practices. • Proficiency in Kubernetes, Docker, and container orchestration. • Knowledge of Linux system administration and scripting (Python, Bash). • Experience with logging, monitoring, and observability tools in a cloud-native environment. • Strong troubleshooting, problem-solving, and automation mindset. Responsibilities/Impact as an SRE: • AWS GovCloud Operations: Manage and optimize cloud-based infrastructure in AWS GovCloud, ensuring … FedRAMP compliance and high availability. • Reliability & Performance: Monitor and enhance system performance, scalability, and reliability through observability tools, automation, and best practices. • Security & Compliance: Implement and maintain security controls aligned with FedRAMP, NIST 800-53, and other federal cybersecurity standards. • Infrastructure as Code (IaC): Develop and manage infrastructure automation using Terraform and Ansible. • CI/CD & Automation: Enhance DevSecOps pipelines …
access to the best tools available. We combine problem-solving skills with software and systems engineering to take a proactive approach in building fault-tolerant and secure systems, improving observability and zealously automating away toil. In this role you will: Use your site reliability expertise to design, operate and support Preqin's infrastructure, middleware and internal services. Improving their performance …
pipelines Drive platform modernisation Manage a small team of engineers Align DevOps capabilities with the wider business Champion DevEx, reliability, and security Embed operational excellence and incident response Promote observability and performance optimisation Lead DevOps Engineer Requirements Proven line management experience Cloud-native expertise (any cloud provider is fine: GCP, AWS or Azure) Knowledge of GitLab CI/CD, Terraform …
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown
Experience with unit, integration, and end-to-end testing tools and practices (e.g. Jest, Cypress, Backstop, Playwright). Experience with CI/CD and Trunk Based Development. Experience with observability tools and practices, including monitoring, logging, and tracing to ensure system reliability and performance. Understanding of Microservices & principles of RESTful API development, including structuring, documenting, versioning, testing and stubbing/…
IT workflows. Your responsibilities will also include developing CI/CD pipelines tailored for IT infrastructure, enhancing deployment efficiency, and integrating robust network security measures. You will establish comprehensive observability and proactive issue resolution strategies. We are seeking individuals passionate about network automation, security, and scalable IT solutions that enhance both campus and cloud network operations. You should possess extensive …
or DevOps Expertise in microservices and API design Docker, and container runtime platforms such as Kubernetes, EKS, ECS etc Strong understanding of operational concepts on AWS, particularly monitoring and observability, FinOps Utilising CI/CD tools, such as Bamboo, Jenkins, TeamCity, Bitbucket, in order to streamline delivery of new features and fixes Continual testing of code using Automated Testing Frameworks
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration …
Tuesdays, Thursdays WFH) Pay: negotiable, inside IR35 We're looking for an experienced DevOps Engineer to join our team on a contract basis, with a focus on AWS infrastructure, observability tooling, and CI/CD automation. This is a hands-on role supporting high-availability systems, rapid deployments, and production incident response. Key Responsibilities - Manage and monitor AWS infrastructure for … performance and security - Respond to production incidents, perform root cause analysis, and implement fixes - Maintain observability tools (Prometheus, Grafana, Splunk) and write PromQL queries - Improve and operate CI/CD pipelines using GitHub Actions and Kubernetes - Automate infrastructure tasks with Python, Bash, Go or SQL - Work with Git-based workflows for infrastructure as code - Troubleshoot Kubernetes workloads and containerised services
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
collaborate across teams to: Modernise our infrastructure by leading the migration from Docker Swarm to Kubernetes Design and operate CI/CD pipelines using CloudBees and GitLab Build out observability with Prometheus, Grafana, OpenTelemetry, and Dynatrace Automate cloud deployments (AWS-first) using Terraform and platform tooling Improve security posture across IAM, secrets, and networking Help the team ship faster and … TypeScript, Python). Validated experience operating distributed systems at scale in production. Cloud AWS (primary), Kubernetes (future), Docker (current), Terraform. Excellent debugging skills across network, systems, and data stack. Observability tooling, e.g. custom metrics pipelines, OpenTelemetry tracing, or integrations across telemetry stacks. Security engineering and practical understanding of IAM hardening, zero-trust network principles, and secrets management in data-heavy …
teams to build cost-effective solutions on GCP while maintaining agility and fostering innovation. This position is perfect for engineers who are passionate about optimising cloud usage, enhancing cost observability, and championing a FinOps culture. What you'll do Partner with engineering, finance and product teams to drive cost-efficiency across GCP Design and implement automation to boost cost optimisation … had GCP certifications (e.g. Professional Cloud DevOps Engineer, Professional Cloud Architect) FinOps Foundation certifications (e.g. Practitioner, Engineer) Familiarity with security tools e.g. HashiCorp Vault, Aquasec, Nexus IQ. Knowledge of observability tools e.g. Dynatrace. Experience in cost management tools e.g. Cloudability. About working for us Our focus is to ensure we're inclusive every day, building an organisation that reflects modern …
/IBM MQ). DevOps Principles: Understanding of DevOps principles and infrastructure as code tools (e.g., Terraform). Performance Tuning: Background in performance tuning, profiling, and monitoring Java applications. Observability and Monitoring: Solid experience with Observability and Monitoring tools (e.g., Splunk/Dynatrace). Leadership and Mentoring: Experience mentoring junior developers or leading small engineering teams. About working for us …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Frontier Resourcing
as Code (IaC) using tools like Terraform or CloudFormation Implement and manage CI/CD pipelines, enabling rapid and reliable deployments Monitor systems for performance, availability, and security, using observability best practices Collaborate in Agile development teams, contributing to sprints, stand-ups, and continuous delivery cycles Troubleshoot infrastructure and deployment issues, delivering fast and sustainable solutions Ensure compliance with industry …
mentoring engineers and collaborating with stakeholders. Proven ability to resolve technical incidents in unfamiliar production systems. Technical and process documentation champion. Experience of operationally managing production software components, including observability, logging, metrics, error reporting, debugging, and live incident management. Your time will be spent roughly as follows: 60% - Proactive technical work (e.g. migrating DB hosting provider, new message bus system …