in DevOps, cloud infrastructure, and automation. Strong knowledge of CI/CD tooling, IaC, and cloud-native technologies. Advanced scripting (Bash, Python) and automation experience. Skilled in monitoring and observability tools (e.g., Prometheus, Grafana, ELK). Strong problem-solving, communication, and leadership skills. Familiarity with and experience of: CI/CD tools: Jenkins, GitLab CI; Infrastructure as Code: Terraform, Ansible, Helm …
one or more public cloud providers such as Azure, AWS or GCP. Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation. Experience with monitoring, observability and logging tools such as DataDog, Prometheus, Grafana, or similar. Proven track record of maintaining highly available and performant production environments. Ability to identify and implement effective mitigation strategies and …
Liverpool, Lancashire, United Kingdom Hybrid / WFH Options
Acorn Insurance
with GitOps tools (e.g., ArgoCD, Flux). CI/CD - Skilled in building and managing pipelines using Azure DevOps, GitHub Actions, etc. Monitoring - Experience with Prometheus, Grafana, and other observability tools. Application Stack - Familiarity with .NET, Node.js, React, and web server technologies like Nginx. Relevant certifications or the ability to demonstrate equivalent experience, such as: Terraform Associate. About Acorn Insurance …
to have: Working within cloud environments (AWS, Azure, GCP) for automation and infrastructure management. Exposure to security compliance frameworks (ISO 27001, CIS benchmarks, NIST). Experience with monitoring and observability tools (Prometheus, Grafana, ELK/EFK stacks). Integration of automation platforms with ticketing systems (ServiceNow, Jira). Hands-on work with container security scanning and remediation processes. Experience in …
Go. Significant experience with AWS cloud infrastructure. Deep understanding of IaC tools: Terraform, Packer, CloudFormation. Proven leadership in multidisciplinary delivery teams. Skills in Databases: MongoDB/Atlas; Messaging: Kafka; Observability: Prometheus, Grafana, Splunk. Experience working in a DevOps environment, favoring and implementing Continuous Integration & Deployment over manual processes. Experience designing, implementing, securing, and supporting Unix/Linux-based platforms (ideally …
Newcastle Upon Tyne, Tyne and Wear, North East, United Kingdom
Anson McCade
to work in the UK. What You'll Be Doing: Automating infrastructure deployment and environment provisioning. Managing and optimising CI/CD pipelines. Developing tooling for monitoring, diagnostics, and observability. Delivering automation solutions for end-user processes. Working across both Windows and Linux platforms. Collaborating closely with developers and infrastructure engineers to improve system reliability and performance. What You'll …
ensure code quality and reliability; Experience of working with Docker for containerisation and application packaging; Experience of implementing and managing monitoring solutions, with experience in Prometheus and Grafana for observability and alerting; Experience of implementing and managing robust security practices, including Encryption (TLS) and Secret Management in the Cloud; Experience of leveraging GitLab API for advanced automation, integration, and reporting …
etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Monitoring utilising products such as: Prometheus, Grafana, ELK, Filebeat etc. Observability - SRE. Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem. Edge technologies e.g. NGINX, HAProxy etc. Excellent knowledge of YAML or similar languages. The following …
machine learning models and analytical services. Implement and enforce security best practices across cloud and network environments. Troubleshoot deployment and performance issues across multiple environments. Set up and maintain observability tools for logging, monitoring, and alerting (e.g., Prometheus, Grafana, Loki). Contribute to internal tooling to streamline development, testing, and operations workflows. Stay current with DevOps trends and recommend improvements …
Qualifications: Exposure to Retrieval-Augmented Generation (RAG) pipelines, vector databases (e.g., Pinecone, Weaviate, Milvus), and knowledge bases, with familiarity in integrating them with LLMs. Experience with advanced model monitoring, observability, and governance of LLMs and generative AI systems. Experience with data engineering or analytics platforms. Understanding of AI safety, security, and compliance best practices in production. Enthusiasm for learning and …
parity, and system security. Support secure configuration of networking, identity, and access management in Azure. Help integrate platform components with client environments, participating in deployments, troubleshooting, and documentation. Drive observability and resilience across environments using tools like Prometheus, Grafana, and OpenTelemetry. Troubleshoot and resolve issues across infrastructure, containers, and deployment pipelines. Contribute to internal and client-facing infrastructure documentation and …
products. Automate repetitive tasks and create CI/CD pipelines for everything. Maintain end-to-end security, ensuring projects meet best practices and Thomson Reuters standards. Maintain and grow observability and monitor all aspects of our infrastructure. Work closely with product, development, operation and support teams; guide them towards best practices, share knowledge, and improve the quality of our products …
pipelines. Drive platform modernisation. Manage a small team of engineers. Align DevOps capabilities with the wider business. Champion DevEx, reliability, and security. Embed operational excellence and incident response. Promote observability and performance optimisation. Lead DevOps Engineer Requirements: Proven line management experience. Cloud-native expertise (any cloud provider is fine: GCP, AWS or Azure). Knowledge of GitLab CI/CD, Terraform …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
automation scripts (Python, Bash, Shell) and tools (GitLab, Terraform, Vault, Ansible) to streamline deployment, monitoring, and management processes using Infrastructure as Code (IaC). Implement and integrate monitoring and observability solutions, like AIOps, for proactive system issue detection and response. Participate in on-call rotations to ensure 24/7 system availability. Maintain detailed documentation of infrastructure, processes, and procedures …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Nordcloud
Patterns for Development. Programming languages, such as C#, Python, Perl, Java, C++. CI/CD tools such as Azure DevOps, GitHub Actions, GitLab, Jenkins, TeamCity. Scripting languages such as PowerShell, Bash. Observability/Monitoring: Prometheus, Grafana, Splunk. Containerisation tools such as Docker, K8S, OpenShift, EC, containers. Analytical and creative approach to problem solving. We encourage you to apply, even if you don …
DevOps processes, and accelerate developer code and feature delivery. Experience with Azure Data technologies, such as Azure Data Factory (ADF), to support data integration and pipeline automation. Experience with observability and monitoring tools such as Datadog, Grafana, or the ELK Stack. In-depth knowledge of networking, security protocols, and firewall configurations. Experience with database management and performance optimisation strategies. Familiarity …
working in Agile teams using tools like Git, Jira, and Confluence. Eligible for SC and NPPV3 clearance. Desirable: Container orchestration with Kubernetes. HashiCorp tools: Vault, Consul, Packer. Monitoring and observability with Grafana, Prometheus, or similar. Familiarity with cloud networking, VPCs, NAT Gateways, security groups, etc. Personal Attributes: Proactive and self-driven with a passion for technology. Strong problem-solving mindset …
existing systems, ensuring adherence to industry best practices and driving innovation. Architect, maintain, and continuously improve our microservices architecture running on Kubernetes in GCP, focusing on scalability, resilience, and observability. Drive cost optimisation initiatives across our infrastructure, implementing sophisticated monitoring and analysis tools to identify and execute on efficiency opportunities. Design and implement advanced monitoring and alerting systems to ensure …
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration …
and distributed systems. Proven background in test-driven development, using tools such as Jest, React Testing Library, Cypress, Playwright or Pact. Practical knowledge of CI/CD pipelines and observability tooling - including GitHub, Jenkins, Docker, ELK, Grafana, and Dynatrace. Demonstrated ability to manage, mentor and develop engineers within a delivery-focused team. A collaborative and delivery-focused mindset, comfortable leading …
including Kafka Connect, Kafka Streams, and Schema Registry. Advanced Kubernetes skills, including managing ingress controllers, implementing service mesh solutions (like Istio or Linkerd), and handling stateful applications. Experience implementing observability patterns across distributed systems using tools like Prometheus, Grafana, and distributed tracing solutions. Hands-on experience with infrastructure automation using Terraform specifically for Azure resources. Ability to drive architectural decisions …
shift-left and introduce automation. You will represent the support function on Change Advisory Boards and incident management calls. You will be introducing single-pane-of-glass and transparent observability, monitoring and alerting in a continual improvement regime. If customer-oriented, cloud-first, cutting-edge true IaC, automated deployment, detection and reporting is your wheelhouse, this is the role …